Preface



The papers in this volume were presented at the Seventh Workshop on Algo- 
rithms and Data Structures (WADS 2001). The workshop took place August 
8-10, 2001 in Providence, Rhode Island, USA. The workshop alternates with 
the Scandinavian Workshop on Algorithm Theory (SWAT), continuing the tra-
dition of SWAT and WADS starting with SWAT ’88 and WADS ’89.

In response to the call for papers, 89 papers were submitted. From these 
submissions, the program committee selected 40 papers for presentation at the 
workshop. In addition, invited plenary lectures were given by the following
distinguished researchers: Mikhail J. Atallah, F. Thomson Leighton, and Mihalis
Yannakakis.

On behalf of the program committee, we would like to express our apprecia- 
tion to the invited speakers and to all the authors who submitted papers. 
Approximation of Multiobjective Optimization Problems



Mihalis Yannakakis 
Abstract. We discuss problems in multiobjective optimization, in which
solutions to a combinatorial optimization problem are evaluated with 
respect to several cost criteria, and we are interested in the trade-off 
between these objectives, the so-called Pareto curve. The Pareto curve
typically has an exponential number of points. However, it turns out
that, under general conditions, there is a polynomially succinct curve 
that approximates the Pareto curve within any desired accuracy. The 
central computational question is whether such an approximate curve 
can be constructed efficiently (in polynomial time). We discuss conditions 
under which this is the case. We examine in more detail the class of linear 
multiobjective problems, and relate the multiobjective approximation to 
the single objective case. We also discuss problems in multiobjective
query optimization.
Abstract. Given a set of n points in the plane, any β-skeleton and
[γ_0, γ_1]-graph can be computed in quadratic time. The presented algo-
rithms are optimal for β values that are less than 1 and [γ_0, γ_1] values
that result in non-planar graphs. For β = 1, we show a numerically
robust algorithm that computes Gabriel graphs in quadratic time and
degree 2. We finally show how a β-spectrum can be computed in optimal
O(n^2) time.



1 Introduction 

A typical approach to extracting a shape from a given set P of n points is 
to compute a proximity graph of P, i.e. a geometric graph whose vertices are 
elements of P and where the edges are straight-line segments connecting pairs 
of points. In a proximity graph of P two points are connected by an edge if 
and only if their region of influence is empty, i.e. it does not contain any other 
element of P. The region of influence of two points u and v of P is a region of the
plane that describes a neighbourhood of u and v; the emptiness of the region of 
influence witnesses that u and v are close enough to each other to be connected 
by an edge. Depending on the application context, different definitions of region 
of influence and of corresponding proximity graphs have been proposed in the 
literature. While the interested reader is referred to the comprehensive survey 
by Jaromczyk and Toussaint |2|, we restrict ourselves to recalling some of the 
most widely studied proximity graphs. 

* Research supported in part by the CNR Project “Geometria Computazionale Robu-
sta con Applicazioni alla Grafica ed al CAD”, the project “Algorithms for Large Data
Sets: Science and Engineering” of the Italian Ministry of University and Scientific
and Technological Research (MURST 40%), by Gen. Cat. SGR000356 and MEC-
DGES-SEUID PB98-0933, Spain, and by NSERC, Canada.
A continuous hierarchy of proximity graphs that includes Gabriel graphs and
relative neighbourhood graphs as special cases was first defined in the compu-
tational morphology context by Kirkpatrick and Radke m- The elements of
this infinite family of proximity graphs are called β-skeletons and are defined by
considering a continuous family of regions of influence indexed by a single real
positive parameter β. The region of influence of two points u and v is called the
β-neighbourhood of u and v. Its area is related to the value β; as β approaches 0,
the β-neighbourhood of u and v approaches the line segment (u, v), while as β
increases the β-neighbourhood of u and v becomes larger. The β-neighbourhood
of two points can be either lune-based or circle-based.

Another parametrized family of proximity graphs, known as γ-graphs, which
unifies circle-based β-skeletons, convex hulls, and Delaunay triangulations, was
defined by Veltkamp m- A precise definition of β-skeletons and γ-graphs will
be given in Section 2.

This paper is devoted to the study of efficient algorithms for computing β-
skeletons and γ-graphs. In order to better explain our contribution, we briefly
review some literature about proximity graphs and then list our results. Existing 
algorithms that compute proximity graphs can be classified according to whether 
they assume the Fixed Proximity Scenario or whether they assume the Variable 
Proximity Scenario. 

Fixed Proximity Scenario: In the fixed proximity scenario the input of the 
problem is a set of points and a definition of closeness between pairs of 
points; the output is a geometric graph such that two vertices are adjacent 
if and only if they satisfy the given definition of proximity. For example, a 
typical problem concerning β-skeletons in this scenario is as follows: Given
a set P of n points and a real β, efficiently compute either the lune-based
or the circle-based β-skeleton of P.

Variable Proximity Scenario: In this scenario it is not known a priori what 
closeness measure to use: the application context requires considering several
different definitions of closeness in order to choose the best-suited one. In
this scenario, the input of the problem is a set of points and a set of definitions 
of closeness between pairs of points; the output is a set of geometric graphs 
describing the different definitions of proximity. 

In the Fixed Proximity Scenario, optimal O(n log n) time algorithms are
known for computing lune-based β-skeletons when β = 1 and when β = 2.
For 1 < β < 2, an optimal O(n log n) time algorithm is described in [HJ- To our
knowledge, there exist only suboptimal algorithms that compute the lune-based
β-skeleton when 0 < β < 1: the fastest algorithm that we know for this problem
requires 0(n^-®logn) time [IS|. For values of β in the range (2, ∞), a (subop-
timal) 0{n^) time algorithm for lune-based β-skeletons is described in [iUll4[ .
As for circle-based β-skeletons, an optimal O(n log n) time algorithm is given
in mu for all values of β in the interval [1, ∞], while the same log n) time
algorithm of uni applies for values of β in the interval [0, 1). The problem of
computing γ-graphs has been studied in CH!, where an O(n^) time algorithm
is described for the general case. Under the assumption that no four points are
cocircular and that the chosen value of γ gives rise to a γ-graph with no edge
crossings, an optimal O(n log n) time algorithm is also presented in |TH) .

In the Variable Proximity Scenario, a key problem is that of encoding the
entire spectrum of the empty neighbourhoods that can be found in the point set.
Given this information, it is easy to compute different proximity graphs with an
output-sensitive strategy. Suppose for example that one wants to compute all
possible (circle-based or lune-based) β-skeletons for values of β in a given range.
If the β_1-skeleton of P has already been computed and the next value of β to
consider is β_2 > β_1, one would like to construct the β_2-skeleton by a sequence of
edge deletions from the β_1-skeleton (since the β_2-neighbourhood contains the β_1-
neighbourhood, it follows that the β_2-skeleton is a subgraph of the β_1-skeleton).
A technique for efficiently solving this problem is based on a pre-processing step
that computes for each pair u, v of a set of points P the largest value β*(u, v)
of the parameter β such that u, v are adjacent in the β*(u, v)-skeleton. The set
of all these maximal values computed for each pair of points of P is called
the β-spectrum of the set of points. Once the β-spectrum of a set P of points
is known, it is possible to scan all the β-skeletons of P by starting with the
β-skeleton with β = 0 and by repeatedly removing the edge with the next smallest
β*(u, v) value. A first O(n^) time algorithm for computing the β-spectrum of
a set of n points is given in HH; the algorithm can be used both for the case
that the β-neighbourhood is lune-based and the case that it is circle-based. Very
recently, a new algorithm for the computation of the lune-based β-spectrum was
presented in uni- The algorithm in m requires 0{rP log n + p) time, where p
is a parameter that depends upon the geometry of the point set and it can be
0{rP) for some problem instances (p is the size of the so-called witness set; see
the cited paper for more details).
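The scan over all β-skeletons described above can be sketched in a few lines. This is only an illustration, not the algorithm of the paper: it assumes the β-spectrum is already available as a dictionary from vertex-index pairs to β*(u, v), and that an edge belongs to the β-skeleton exactly when β*(u, v) ≥ β. The names `scan_skeletons`, `spectrum` and `betas` are hypothetical.

```python
from typing import Dict, Iterator, List, Set, Tuple

Edge = Tuple[int, int]


def scan_skeletons(spectrum: Dict[Edge, float],
                   betas: List[float]) -> Iterator[Tuple[float, Set[Edge]]]:
    """Yield (beta, edge set of the beta-skeleton) for increasing beta.

    Assumption: an edge survives in the beta-skeleton iff beta*(u, v) >= beta.
    """
    # Edges ordered by the largest beta at which they are still present.
    order = sorted(spectrum, key=spectrum.get)
    alive = set(order)  # the skeleton for beta = 0 contains every pair
    nxt = 0
    for beta in sorted(betas):
        # Delete, in order, every edge whose beta* is below the current beta.
        while nxt < len(order) and spectrum[order[nxt]] < beta:
            alive.discard(order[nxt])
            nxt += 1
        yield beta, set(alive)
```

Each edge is deleted at most once, so after the initial sort the whole sweep costs time linear in the size of the spectrum plus the size of the reported graphs.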

The contribution of this paper is twofold. We present two simple algorithmic
strategies that can be applied to solve a number of proximity problems both
in the Fixed Proximity Scenario and in the Variable Proximity Scenario. The
application of these techniques led to optimal algorithms for the computation of
dense β-skeletons and γ-graphs in the Fixed Proximity Scenario and to breaking
the 0{rP) barrier for the computation of β-spectra in the Variable Proximity
Scenario. It also led to the first robust algorithm for computing Gabriel graphs
which uses only double precision arithmetic and requires O(n^2) time. For a formal
definition of the adopted model of computation which takes into account the
arithmetic precision and for other results where this model is used see, e.g., |31
npif I ] . Our results can be listed as follows.



— We exhibit an O(n^2)-time algorithm for the computation of the (lune-based
or circle-based) β-skeleton of a set of n points in the plane. This is worst
case optimal when 0 < β < 1 since for these values of β the β-skeleton can
have Θ(n^2) edges. As described above, the previously known time bound for
this problem is 0{rP'^ logn) pi 0) .

— We extend our technique to the computation of γ-graphs and present an
optimal O(n^2) time algorithm for the computation of this family of graphs
in the Fixed Proximity Scenario. The previously known bound is O(n^) [18].

— We give an O(n^2)-time algorithm for the computation of the circle-based β-
spectrum. Our algorithm is optimal (the size of the β-spectrum is Θ(n^2)) and
improves over the previously known O(n^) time algorithm pi 1 )| .

— We extend the technique of the previous item and obtain an optimal O(n^2)-
time algorithm for the computation of the lune-based β-spectrum. This result
improves over the O(n^ log n + p)-time algorithm in [flip.

— As a further application of our technique we show the first algorithm that
computes the Gabriel graph of a set of n points in O(n^2) time and requires
only double precision integer arithmetic computations. We remark that the
optimal O(n log n)-time algorithm described in the literature relies on
the computation of the Delaunay triangulation which may require an arith-
metic precision four times the one used to represent the input data |^.

For reasons of space, some proofs have been omitted in this extended abstract. 

2 Preliminaries 

In this section we introduce some notation and definitions that we use in sub- 
sequent sections. Let P be a set of points in the plane and let u and v be two 
points of P. The Euclidean distance between u and v is denoted by d(u, v). The 
arc (u, v) is the directed line segment from u to v, and the edge (u, v) is the
undirected line segment from u to v. Consider a circle of radius r through u and
v and let C be one of the two circular arcs connecting u to v on the circle.
We now define rules for associating a real value with C and a real value with
the pair u, v.

We associate with C a value that we call the β-value of C. If C is smaller
than πr, we define the β-value of C as d(u, v)/(2r). If C is at least as large as
πr, the β-value of C is set equal to 2r/d(u, v). If r = ∞, then one of the two
circular arcs connecting u to v coincides with the edge (u, v) and has β-value
equal to 0, while the other circular arc has β-value equal to ∞.

For a given value of β let C_β be the circular arc of value β that lies to the
right of the arc (u, v) and let D be the disk such that C_β is a portion of its
circumference. We call the portion of D bounded by edge (u, v) and C_β the right
β-region of u, v and denote it as C_r(u, v, β). The region C_r(u, v, β) is assumed to
include the edge (u, v) but not C_β. For β = ∞, we define C_r(u, v, ∞) as the open
half plane to the left of the arc (u, v), plus the edge (u, v). Finally C_r(u, v, 0)
is the empty set. See Figure 1 for an illustration. Similarly we define the left
β-region of u, v and denote it as C_l(u, v, β). We say that the right or the left
β-region of two points u, v in a point set P is empty if it does not contain any
point of P other than u and v. For example, in Figure 1 both C_l(u, v, and
C_r(u, v, 1) are empty.

We associate with u, v a value that we call the right β-value of u, v and
denote it as β_r(u, v). The right β-value of u, v is the largest value of β such that
C_r(u, v, β) is empty. Similarly we define the left β-value of u, v and denote it as
β_l(u, v). The β-value of u, v is the minimum of β_l(u, v) and β_r(u, v) and is denoted
as β(u, v). For example, the right β-value of u, v in Figure 1 is β_r(u, v) = 1, the
left β-value is β_l(u, v) = and the β-value is β =

Fig. 1. The right β-value, the left β-value, and the β-value of two points u and v in a
point set.

Property 1. Let u, v be a pair of points in a point set P; we have β_r(u, v) =
β_l(v, u) and C_r(u, v, β) = C_l(v, u, β).

The β-spectrum of a set P of points is the set of all pairs of points of P where
each pair is labeled with its β-value. The circle-based β-neighbourhood mu of two
points u and v is equal to C_l(u, v, β) ∪ C_r(u, v, β). So the circle-based β-skeleton of
P is a proximity graph such that an edge (u, v) belongs to the graph if and only
if C_l(u, v, β) ∪ C_r(u, v, β) is empty. The following property relates β-skeletons to
right and left β-values.

Property 2. Let P be a set of points in the plane, let u, v be a pair of points of
P, and let β be a real positive value such that 0 < β < ∞. The β-region of u
and v is empty if and only if the right and the left β-values of u, v are such that
β_r(u, v) ≥ β and β_l(u, v) ≥ β.

If for example we set β = | and look at points u and v in Figure 1 we can
conclude by Property 2 that (u, v) is an edge of the β-skeleton; if we set β >
then u and v are not adjacent in the β-skeleton of the set of points.
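To make the definitions of this section concrete, the following brute-force sketch computes right, left and combined β-values directly from the definition. It is an illustration under stated assumptions, not one of the algorithms of the paper: it runs in O(n) time per pair, uses floating-point arithmetic, takes "right of the arc (u, v)" to mean a negative cross product in a y-up coordinate system, and uses the fact that, by the law of sines, the arc from u to v through a point w has β-value sin θ when the angle θ = ∠uwv is at least 90 degrees and 1/sin θ otherwise. All function names are hypothetical.

```python
import math


def arc_beta(u, v, w):
    """Beta-value of the circular arc from u to v that passes through w."""
    ax, ay = u[0] - w[0], u[1] - w[1]
    bx, by = v[0] - w[0], v[1] - w[1]
    dot = ax * bx + ay * by
    # sin of the angle at w; equals d(u, v) / (2r) by the law of sines.
    sin_theta = abs(ax * by - ay * bx) / (math.hypot(ax, ay) * math.hypot(bx, by))
    if dot <= 0:                      # angle >= 90 degrees: arc no longer than pi*r
        return sin_theta              # d(u, v) / (2r)
    return math.inf if sin_theta == 0 else 1.0 / sin_theta   # 2r / d(u, v)


def right_beta_value(u, v, points):
    """Largest beta for which the right beta-region of (u, v) is empty."""
    best = math.inf
    for w in points:
        if w == u or w == v:
            continue
        side = (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])
        if side <= 0:                 # w lies to the right of, or on, the line u -> v
            best = min(best, arc_beta(u, v, w))
    return best


def left_beta_value(u, v, points):
    # Property 1: the left beta-value of (u, v) is the right beta-value of (v, u).
    return right_beta_value(v, u, points)


def beta_value(u, v, points):
    return min(right_beta_value(u, v, points), left_beta_value(u, v, points))
```

Calling `beta_value` on every pair gives the circle-based β-spectrum in cubic time, which can serve as a reference when testing a faster implementation.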

For 0 < β < 1, the lune-based β-neighbourhood is the same as the circle-based
β-neighbourhood. For β > 1, the lune-based β-neighbourhood of points u and v
is the intersection of the two disks through u and v of radius β d(u, v)/2.

In 1 1 Veltkamp introduces γ-graphs as a parametrized family of proximity
graphs which include circle-based β-skeletons as special cases and are defined in
terms of two γ-parameters, named γ_0 and γ_1. The γ-graph of P is also denoted
as the [γ_0, γ_1]-graph of P to explicitly mention the two γ-parameters. Let γ_0
and γ_1 be two real values such that -1 < |γ_0| < |γ_1| < 1 and let u, v be a pair
of points of P. The two values γ_0 and γ_1 define two [γ_0, γ_1]-neighbourhoods of
u, v. The [γ_0, γ_1]-graph of P is a proximity graph such that any two vertices
u, v are adjacent if and only if at least one of the two [γ_0, γ_1]-neighbourhoods
of u, v is empty. The two [γ_0, γ_1]-neighbourhoods of u and v can be defined
in terms of left and right β-regions: the [γ_0, γ_1]-neighbourhoods of u and v are
C_l(u, v, β_0) ∪ C_r(u, v, β_1) and C_r(u, v, β_0) ∪ C_l(u, v, β_1), where the correspondence
between the pair (β_0, β_1) and (γ_0, γ_1) is as follows. If γ_i > 0 then β_i = 1/(1 - γ_i)
and if γ_i < 0 then β_i = 1 + γ_i for i = 0, 1.
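The correspondence between γ-parameters and β-values can be written as a small helper. This is a sketch only: the function name is hypothetical, the value returned for γ = 0 is an assumption (both formulas give 1 in the limit), and so is the treatment of γ ≥ 1 as β = ∞.

```python
import math


def gamma_to_beta(gamma: float) -> float:
    """Map a gamma-parameter to the beta-value of the corresponding region."""
    if gamma >= 1:
        return math.inf            # assumption: limit of 1 / (1 - gamma)
    if gamma > 0:
        return 1.0 / (1.0 - gamma)
    if gamma < 0:
        return 1.0 + gamma
    return 1.0                     # assumption: both formulas agree at gamma = 0
```

For example, γ = 0.5 corresponds to β = 2 and γ = -0.5 corresponds to β = 0.5.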

3 Algorithms for the Fixed Proximity Scenario 

In this section we first present a simple algorithm for the following problem. Let
P be a set of points in the plane and let β be a given positive real number. For
each pair of points u, v of P we ask ourselves whether or not the right β-region of
u, v contains an element of P. We call this problem the right β-region problem.
We shall exploit our solution to the right β-region problem to efficiently compute
β-skeletons, γ-graphs, and Gabriel graphs with low arithmetic degree.

3.1 The Right β-Region Problem

Our algorithm for the right β-region problem assigns each arc (u, v) a label
according to whether the right β-region of u, v is empty or not. The arc (u, v)
is labeled Yes if C_r(u, v, β) contains no points of P; it is labeled No otherwise.
A technique similar to ours has been used by Beirouti and Snoeyink |2| for
computing triangle emptiness in the context of LMT heuristics for minimum
weight triangulations. A precise description of our algorithm is as follows.

Algorithm Right β-region

Step 1: For each point u ∈ P compute an ordered list of the remaining points,
sorted in radial order around u in clock-wise direction. Do Step 2 for each
point u ∈ P.

Step 2: Let p_0 be the point closest to u. Let p_0, p_1, ..., p_{n-2} be the ordered
sequence of points of P - {u} radially sorted around u in clock-wise direction.
For i = 0, 1, 2, ..., n - 2 do (arithmetic in indices is done mod n - 1):

Step 2.1: — If p_{i+1} ∉ C_r(u, p_i, β) then label the arc (u, p_i) with a label
Maybe.

— If p_{i+1} ∈ C_r(u, p_i, β) then do the following:

1. Label (u, p_i) with a label No;

2. Set k = i;

3. Determine j such that j < k and such that j is the largest
index for which (u, p_j) has the label Maybe. If j > 0 and if
p_{i+1} ∈ C_r(u, p_j, β) then label (u, p_j) with No, set k = j and
repeat this instruction, else this instruction is done.

Step 2.2: Scan again p_0, p_1, ..., p_{n-2}; for each arc (u, p_i) (0 ≤ i ≤ n - 2)
labeled Maybe change its label to Yes.

End of Algorithm Right β-region
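The input that Step 2 expects for one point u can be produced as in the sketch below. It is not the quadratic-time sorting procedure cited in the proof of Theorem 1: it simply sorts by angle with floating-point `atan2`, so it costs O(n log n) per point and is not a low-degree computation. The function name and the tie-breaking by distance are assumptions made for illustration.

```python
import math


def radial_order_clockwise(u, points):
    """Points of P - {u} in clock-wise order around u, starting at the closest."""
    others = [p for p in points if p != u]

    def dist(p):
        return math.hypot(p[0] - u[0], p[1] - u[1])

    closest = min(others, key=dist)
    base = math.atan2(closest[1] - u[1], closest[0] - u[0])

    def key(p):
        ang = math.atan2(p[1] - u[1], p[0] - u[0])
        # Clock-wise angle from the direction of the closest point, in [0, 2*pi);
        # points in the same direction are ordered by distance from u.
        return ((base - ang) % (2 * math.pi), dist(p))

    return sorted(others, key=key)
```

The returned list plays the role of p_0, p_1, ..., p_{n-2} in Step 2, with p_0 the point closest to u.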

The following lemmas prove the correctness of the above algorithm and show
that its time complexity is O(n^2). The proof of the first lemma follows from
elementary geometry.
Lemma 1. If i < j and the angle between arcs (u, p_i) and (u, p_j) is less than π
and if p_j ∉ C_r(u, p_i, β) then C_r(u, p_i, β) ∩ C_r(u, p_j, ∞) ⊂ C_r(u, p_j, β).

Lemma 2. After Algorithm Right β-region has performed Step 2.1 for a
particular value of i, we have that for each value of h with 0 ≤ h ≤ i for
which arc (u, p_h) is labeled Maybe, C_r(u, p_h, β) does not contain any points of
{p_{h+1}, p_{h+2}, ..., p_{i+1}}.

Theorem 1. Let P be a set of n points in the plane. Algorithm Right
β-region solves the right β-region problem for P in O(n^2) time.

Proof. Correctness follows from Lemma 2 and from the fact that since p_0 is the
point closest to u, C_r(u, p_{n-2}, β) is non-empty if and only if p_0 ∈ C_r(u, p_{n-2}, β).

The radial sorting of Step 1 can be done in O(n^2) time by the algorithm of fTT?]
or the alternative algorithm in jjj. If for each point p_i we maintain a pointer to
the last encountered arc with a Maybe label, Step 2.1 will require O(n) time.
Step 2.2 trivially requires O(n) time. Therefore, Step 2 requires O(n) time
and since it is executed n times, Algorithm Right β-region has an O(n^2)
time complexity.

3.2 Computing Proximity Graphs 

We will now use Algorithm Right β-region to compute proximity graphs. We
first consider β-skeletons for values of β such that 0 < β < 1. In this interval the
β-neighbourhood is the same for lune-based and circle-based graphs; therefore
in the statement of the next theorem we do not distinguish between lune-based
and circle-based β-skeletons.

Theorem 2. Let P be a set of n points in the plane and let 0 < β < 1. There
exists an algorithm that computes the circle-based and the lune-based β-skeleton
of P in optimal O(n^2) time.

Proof. If β = 0, the β-skeleton of P is the complete graph and can be easily
computed in O(n^2) time by connecting all pairs of points. If β > 0, we compute
the β-skeleton of P as follows. We first execute Algorithm Right β-region on
P. Note that the definition of right β-region of u, v implicitly contains a notion
of direction from u to v and that by Property 1 C_r(u, v, β) = C_l(v, u, β). Then
we compute all edges (u, v) of the β-skeleton by checking if both the arc (u, v)
and the arc (v, u) have been labeled Yes by Algorithm Right β-region. The
correctness of the algorithm is a consequence of Property 2 and Theorem 1. The
bound on the time complexity is a consequence of Theorem 1.



Theorem 3. Let P be a set of n points in the plane and let γ_0, γ_1 be a pair
of real values such that -1 < |γ_0| < |γ_1| < 1. There exists an algorithm that
computes the [γ_0, γ_1]-graph of P in optimal O(n^2) time.

Proof. We compute the [γ_0, γ_1]-graph of P by the following procedure. First the
value of β_0 and the value of β_1 corresponding to γ_0 and γ_1 are computed (see
Section 2). We execute Algorithm Right β-region on P twice: once for β = β_0
and a second time for β = β_1. A pair u, v of points is connected by an edge in
the [γ_0, γ_1]-graph if one of the following two events happens: (i) The arc (u, v)
has been labeled Yes by Algorithm Right β-region when β = β_0 and (v, u)
has been labeled Yes when β = β_1; or (ii) The arc (u, v) has been labeled Yes
by Algorithm Right β-region when β = β_1 and (v, u) has been labeled Yes
when β = β_0. The bound on the time complexity is a consequence of Theorem 1.
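The way the two runs are combined in the proof can be sketched as follows. The sketch assumes that the Yes labels of each run of Algorithm Right β-region are already available as sets of directed arcs (ordered pairs); producing those sets is not shown, and the names `gamma_graph_edges`, `yes_beta0` and `yes_beta1` are hypothetical.

```python
def gamma_graph_edges(points, yes_beta0, yes_beta1):
    """Edges of the [gamma_0, gamma_1]-graph from two sets of Yes-labeled arcs.

    yes_beta0 / yes_beta1: sets of arcs (u, v) labeled Yes for beta_0 / beta_1.
    """
    edges = set()
    for i, u in enumerate(points):
        for v in points[i + 1:]:
            # Event (i): right region of (u, v) empty for beta_0 and
            # right region of (v, u) empty for beta_1.
            case_i = (u, v) in yes_beta0 and (v, u) in yes_beta1
            # Event (ii): the same test with the roles of beta_0 and beta_1 swapped.
            case_ii = (u, v) in yes_beta1 and (v, u) in yes_beta0
            if case_i or case_ii:
                edges.add((u, v))
    return edges
```

With hash-based sets each pair is tested in expected constant time, so this combination step does not exceed the quadratic bound of the two labeling runs.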



As a further application of Algorithm Right β-region, we observe that it
can also be used to compute circle-based β-skeletons for values of β such that
β > 1. When β = 1 and when the β-neighbourhood is a closed set, the β-skeleton
of P coincides with the Gabriel graph of P. It can be shown that Algorithm
Right β-region can be slightly modified to compute the Gabriel graph of a set
of points in O(n^2) time. This result is not very surprising on its own since an
optimal O(n log n) algorithm for Gabriel graphs (and also for every circle-based
β-skeleton with β > 1) is known iHnni- However, the state of affairs changes if
we revisit existing algorithms in terms of their robustness.

The optimal algorithm for Gabriel graphs is based on first computing the
Delaunay triangulation and then deleting the edges that are not Gabriel edges.
The implementation of a numerically robust code for Delaunay triangulation
requires evaluating the sign of irreducible polynomials of algebraic degree 4
(see, e.g. 0 ), thus requiring a numerical precision four times the one required
for representing the input data. On the other hand, it is a trivial task to compute
the Gabriel graph of a set of points by simply comparing Euclidean distances in
O(n^3) time. In this case, the algebraic degree of the polynomials to evaluate is
only 2, which can be shown to be optimal. Therefore, we can identify a trade-off
between time-complexity and numerical precision required by algorithms that
compute Gabriel graphs. In the next theorem we show how to reduce the trade-
off.
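The trivial cubic-time, degree 2 approach mentioned above can be sketched as follows. The sketch assumes integer input coordinates and a closed disk. Instead of comparing distances to the midpoint of u and v, it uses an equivalent test: by Thales' theorem, w lies in the closed disk with diameter uv exactly when (u - w)·(v - w) ≤ 0, which is a polynomial of degree 2 in the coordinates. The function name is hypothetical.

```python
def gabriel_graph_bruteforce(points):
    """Return the Gabriel edges as index pairs (i, j) with i < j, in O(n^3) time."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            u, v = points[i], points[j]
            empty = True
            for k in range(n):
                if k == i or k == j:
                    continue
                w = points[k]
                # Degree 2 test: w is in the closed disk with diameter uv
                # iff the dot product (u - w) . (v - w) is non-positive.
                dot = (u[0] - w[0]) * (v[0] - w[0]) + (u[1] - w[1]) * (v[1] - w[1])
                if dot <= 0:
                    empty = False
                    break
            if empty:
                edges.append((i, j))
    return edges
```

With Python integers the test is exact, so the output does not depend on rounding; in a fixed-width setting, roughly twice the input precision is enough for the products.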

We adopt the degree model of computation introduced in m and analyze the
performance of Algorithm Right β-region in terms of required numerical pre-
cision to compute Gabriel graphs. In this model of computation the robustness
of a geometric algorithm is evaluated by looking at the irreducible polynomial
of highest algebraic degree whose sign is evaluated by the algorithm during its
execution. This quantity is called the degree of the algorithm and is a measure of
the numerical precision that a robust implementation would require to guarantee
correct outputs independent of degenerate configurations of the input data. By
analyzing the algebraic degree of the geometric tests in the elegant and simple
algorithm by Overmars and Welzl m we can conclude the following.

Lemma 3. Let P be a set of n points in the plane. The algorithm by Overmars
and Welzl m solves the problem of reporting for each u ∈ P all points of P - {u}
radially sorted around u in O(n^2) time and degree 2.

When β = 1, the geometric tests performed by Step 2 of Algorithm Right
β-region take into account triplets of input points, say p, q, r, compute the
centre c of the disk having p, q as antipodal points, and compare d(p, c) with
d(r, c). This corresponds to comparing numbers of at most 2(b + 1) bits where b
bits is the input precision. We can summarize this discussion as follows.

Theorem 4. Let P be a set of n points in the plane. There exists an optimal
degree 2 algorithm that computes the Gabriel graph of P in O(n^2) time.

4 Algorithms for the Variable Proximity Scenario 

In this section we consider the problem of computing the β-spectrum of a set
of points. Before we present the algorithm for computing the β-spectrum, we
introduce some new notation and a few lemmas. After that we are in a position
to present the algorithm and its proof of correctness.

Let u, v and w be three points in the plane and let C(u, v, w) be the disk
through u, v and w. We denote with C'(u, v, w) the subset of C(u, v, w) bounded
by chord (u, w) and the circular arc from u, through v ending at w.

Let p be an arbitrary point and let C be a circle through p. Let l_0 be a
line through p and tangent to C. Assume that l_0 is horizontal and that C lies
above l_0. Let B = {p_0, p_1, p_2, ..., p_k, p_{k+1}} be a set of points that lie above l_0.
Assume that points p_0, p_1, ..., p_k are counter clock-wise radially sorted around
p. Moreover assume that p_0 lies on C and that the remaining points of B lie on
or outside C. Let l_i be the line tangent to C(p, p_{i-1}, p_i) through p. Let α_i be the
top right angle between l_0 and l_i. We say that {p_0, p_1, p_2, ..., p_k} is an increasing
set if α_1 < α_2 < ... < α_k. An increasing set of points is depicted in Figure 2.
In Lemmas 4 to 7 we denote with l the line through p and p_k and with a_i
the highest intersection point between l and circle C(p, p_{i-1}, p_i). The highest
intersection point of C and l is called a_0. Also, for any two points r and s on
line l we say that r < s if d(p, r) < d(p, s). Notice that if {p_0, p_1, p_2, ..., p_k} is
an increasing set of points, we have a_0 < a_1 < ... < a_k.

Lemma 4. Let P = {p, p_0, p_1, ..., p_k} be a set of points such that
{p_0, p_1, ..., p_{k-1}} is an increasing set. Let i be the largest index such that
C_r(p, p_k, β_r(p, p_k)) = C'(p, p_i, p_k). Then p_{j-1} ∈ C'(p, p_j, p_k) for i < j < k and
p_{j+1} ∉ C'(p, p_j, p_k) for i < j < k - 1.

Lemma 5. Let P = {p, p_0, p_1, ..., p_k} be a set of points such that
{p_0, p_1, ..., p_{k-1}} is an increasing set. Let i be the smallest index for which
C_r(p, p_k, β_r(p, p_k)) = C'(p, p_i, p_k). Then p_{j+1} ∈ C'(p, p_j, p_k) and
p_{j-1} ∉ C'(p, p_j, p_k) for 0 < j < i.

Lemma 6. Let P = {p, p_0, p_1, ..., p_k} be a set of points such that
{p_0, p_1, ..., p_{k-1}} is an increasing set. If C_r(p, p_k, β_r(p, p_k)) = C'(p, p_i, p_k) then
{p_0, p_1, ..., p_i, p_k} is an increasing set of points.

Fig. 2. Increasing set {p_0, p_1, ..., p_4}.



Lemma 7. Let P = {p, p_0, p_1, ..., p_k, p_{k+1}} be a set of points such that
{p_0, p_1, ..., p_{k-1}} is an increasing set. If C_r(p, p_k, β_r(p, p_k)) = C'(p, p_i, p_k) then
{p_i, p_k} ∩ C'(p, p_j, p_{k+1}) ≠ ∅ for i < j < k.

We are now in a position to describe our algorithm for computing the
circle-based β-spectrum of a set of points. We start with an algorithm for
an easier problem that we call the right β-value problem for a point p. Let
P = {p, p_0, p_1, ..., p_{n-1}} be a set of points in the plane, let C be a circle through
p and p_0 and let l be the line through p tangent to C. Assume that all points
in {p_0, p_1, ..., p_{n-1}} lie outside or on C, and on the same side of l as C does.
Also assume that points p_0, p_1, ..., p_{n-1} are radially sorted around p in counter
clock-wise direction. The right β-value problem for p is to compute the right
β-values β_r(p, p_i) for 1 ≤ i ≤ n - 1.

Algorithm Right β-values

Step 1: Set B = {p_0, p_1}; compute and record β_r(p, p_1).
Step 2: For k = 2, 3, ..., (n - 1) do the following:

Step 2.1: Compute point p_i ∈ B such that C_r(p, p_k, β_r(p, p_k)) =
C'(p, p_i, p_k).

Step 2.2: Record β_r(p, p_k).

Step 2.3: Set B = {p_0, p_1, ..., p_i, p_k}.

End of Algorithm Right β-values
Lemma 8. Let P = {p, p_0, p_1, ..., p_{n-1}} be a set of points in the plane as de-
scribed above. Algorithm Right β-values solves the right β-value problem in
O(n) time.

Proof. The correctness of Algorithm Right β-values follows from the follow-
ing observations. Initially the set B is an increasing set. Lemma 6 shows that
after Step 2.3 set B is still increasing. Moreover, the elements of P that are
not in B are not relevant when we compute β_r(p, p_k) in Step 2.1, as it is shown
in Lemma 7. Finally, Step 2.1 can be performed with a sequential scan of the
elements of B. Lemmas 4 and 5 show that the amortized cost of Step 2 is O(n).

The next lemma shows how to compute the right β-values for all pairs of
points in a point set.

Lemma 9. Let P be a set of n points in the plane. There exists an algorithm
that computes the right β-value for each pair of points of P in optimal O(n^2)
time.

Proof. The set of right β-values is computed as follows. We first construct the
Delaunay triangulation of P [ini. Let (a, b, c) be a triangle in this triangulation.
Without loss of generality assume that (a, b, c) is the clock-wise order of these
points around the triangle. Since C(a, b, c) contains no points from P in its in-
terior, we can immediately compute β_r(a, b), β_r(b, c) and β_r(c, a). Secondly we
radially sort P - {p} around p for each point p of P. Let {p_1, p_2, ..., p_m} be
the set of points in C_l(b, a, ∞) ∩ C_r(b, c, ∞), radially sorted in counter clock-wise
order around b. We execute Algorithm Right β-values to obtain β_r(b, p_i) for
1 ≤ i ≤ m. Repeating this for all three corners of all Delaunay triangles gives us
the β_r(u, v) for all pairs of points (u, v) of P. The correctness of the algorithm
follows from the correctness of Algorithm Right β-values (Lemma 8).

Concerning the time complexity of our algorithm, we observe that computing
the Delaunay triangulation of P and the radially sorted lists can be done in O(n^2)
time; since Algorithm Right β-values is executed O(n) times and since each
execution requires O(n) time by Lemma 8, it follows that the computational cost
of the algorithm is O(n^2). Since there are Θ(n^2) pairs of points there are Θ(n^2) right
β-values to compute and the time complexity of the algorithm is asymptotically
optimal.

The results above easily imply the main result of this section.

Theorem 5. Let P be a set of n points in the plane. There exists an algorithm
that computes the circle-based β-spectrum of P in optimal O(n^2) time.

We can extend the results of Theorem 5 to compute the lune-based β-
spectrum of a set of points.

Theorem 6. Let P be a set of n points in the plane. There exists an algorithm
that computes the lune-based β-spectrum of P in optimal O(n^2) time.
Abstract. We give linear-time quasiconvex programming algorithms for 
finding a Mobius transformation of a set of spheres in a unit ball or 
on the surface of a unit sphere that maximizes the minimum size of a 
transformed sphere. We can also use similar methods to maximize the 
minimum distance among a set of pairs of input points. We apply these 
results to vertex separation and symmetry display in spherical graph 
drawing, viewpoint selection in hyperbolic browsing, element size control 
in conformal structured mesh generation, and brain flat mapping. 



1 Introduction 

Mobius transformations of d-dimensional space form one of the fundamental 
geometric groups. Generated by inversions of spheres, they preserve spherical 
shape as well as the angles between pairs of curves or surfaces. We consider here 
problems of finding an optimal Mobius transformation: 

— Given a set of (d − 1)-dimensional spheres in a d-dimensional unit ball, find 
a Mobius transformation that maps the ball to itself and maximizes the 
minimum radius among the transformed spheres. 

— Given a set of (d − 1)-dimensional spheres on a sphere S^d, find a Mobius 
transformation of S^d that maximizes the minimum radius among the trans- 
formed spheres. 

— Given a graph connecting a set of vertices on S^d or in the unit ball in E^d, 
find a Mobius transformation that maximizes the minimum edge length. 

We develop efficient algorithms for these problems, by formulating them as 
quasiconvex programs in hyperbolic space. The same formulation also shows 
that simple hill-climbing methods are guaranteed to find the global optimum; 
this approach is likely to work well in practice as an alternative to our more 
complicated quasiconvex programs. We apply these results to the following areas: 

— Spherical graph drawing [15]. Any embedded planar graph can be repre- 
sented as a collection of tangent circles on a sphere S^2; this representation 
is unique for maximal planar graphs, up to Mobius transformation. Our al- 
gorithms can find a canonical spherical realization of any planar graph that 
optimizes the minimum circle radius or the minimum separation between two 
vertices, and that realizes any symmetries implicit in the given embedding. 

— Hyperbolic browsing [16]. The Poincare model of the hyperbolic plane has 
become popular as a way of displaying web sites and other graph models too 
complex to view in their entirety. This model permits parts of the site struc- 
ture to be viewed in detail, while reducing the size of peripheral parts. Our 
algorithms can find a “central” initial viewpoint for a hyperbolic browser, 
that allows all parts of the sites to be viewed at an optimal level of detail. 

— Mesh generation [3,24]. A principled method of structured mesh generation 
involves conformal mapping of the problem domain to a simple standardized 
shape such as a disk, construction of a uniform mesh in the disk, and then 
inverting the conformal mapping to produce mesh elements in the original 
domain. Mobius transformations can be viewed as a special class of conformal 
maps that take the disk to itself. Our algorithms can find a conformal mesh 
that meets given requirements of element size in different portions of the 
input domain, while using a minimal number of elements in the overall mesh. 

— Brain flat mapping [13]. Hurdal et al. proposed a system for visualizing con- 
voluted brain surfaces, by approximate conformal mapping of those surfaces 
to a Euclidean disk, sphere, or hyperbolic plane. Our algorithms can find a 
conformal mapping that minimizes the resulting areal distortion. 

We assume throughout that the dimension d of the spaces we deal with is a 
constant; most commonly in our applications, d = 2 or d = 3. We omit many 
details in this extended abstract; see the full paper [2] for details. 

2 Preliminaries 

2.1 Mobius Transformation and Hyperbolic Geometry 

An inversion of the set R^d ∪ {∞}, generated by a sphere C with radius r, maps 
to itself every ray that originates at the sphere’s center. Each point is mapped to 
another point along the same ray, so that the product of the distances from the 
center to the point and to its image equals r^2. The center of C is mapped to ∞ 
and vice versa. An inversion maps each point of C to itself, transforms spheres to 
other spheres, and preserves angles between pairs of curves or surfaces. Repeating 
an inversion produces the identity mapping on R^d ∪ {∞}. The set of products of 
inversions forms a group, the group of Mobius transformations on the Euclidean 
space E^d. If we restrict our attention to the subgroup that maps a given sphere 
to itself, we find the group of Mobius transformations on that sphere. 
Although our problem statements involve Euclidean and spherical geometry, 
our solutions involve techniques from hyperbolic geometry [14], and in particular 
the classical methods of embedding hyperbolic space into Euclidean space: the 
Poincare model and the Klein model. 
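
The inversion described at the start of this subsection can be written down directly: a point at distance s from the center moves along its ray to distance r^2/s. The following Python sketch is only an illustration of that definition; the function name invert and the tuple representation of points are our own choices, and the point at infinity is not modeled.

```python
def invert(x, center, r):
    """Invert the point x in the sphere with the given center and radius r.
    Points are tuples of floats. The center itself has no finite image
    (it is exchanged with the point at infinity), so it is rejected."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    d2 = sum(v * v for v in diff)          # squared distance to the center
    if d2 == 0:
        raise ValueError("the center of the sphere maps to infinity")
    scale = r * r / d2                     # keeps the ray, rescales the distance
    return tuple(ci + scale * v for ci, v in zip(center, diff))

y = invert((3.0, 0.0), (0.0, 0.0), 2.0)   # distance 3 becomes 4/3, product r^2 = 4
print(y, invert(y, (0.0, 0.0), 2.0))      # applying the inversion twice returns x
```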
Fig. 1. Poincare (left) and Klein (right) models of the hyperbolic plane. Analogous 
models exist for any higher dimensional hyperbolic space. 



In both the Poincare and Klein models, the d-dimensional hyperbolic space 
H^d is viewed as an open unit ball in a Euclidean space E^d, while the unit sphere 
bounding the ball forms a set of points “at infinity”. In the Poincare model (Fig- 
ure 1, left), the lines of the hyperbolic space are modeled by circular arcs of the 
Euclidean space, perpendicular to the unit sphere. Hyperplanes are modeled by 
spheres perpendicular to the unit sphere, and hyperbolic spheres are modeled 
by spheres fully contained within the unit ball. Two more classes of surfaces 
are also modeled as Euclidean spheres: hyperspheres (surfaces at constant dis- 
tance from a hyperplane) are modeled by spheres crossing the unit sphere non- 
perpendicularly, and horospheres are modeled by spheres tangent to the unit 
sphere. The Poincare model thus preserves spherical shape as well as the angles 
between pairs of curves or surfaces. Any horosphere or hypersphere divides H^d 
into a convex and a nonconvex region; we define a horoball or hyperball to be the 
convex region bounded by a horosphere or hypersphere respectively. 

In the Klein model (Figure 1, right), the lines of the hyperbolic space map to 
line segments of the Euclidean space, formed by intersecting Euclidean lines with 
the unit ball. Hyperplanes thus map to hyperplanes, and convex bodies map to 
convex bodies. Although the Klein model does not preserve spherical shape, it 
does preserve flatness and convexity. In the Klein model, spheres are modeled 
as ellipsoids contained in the unit ball, horospheres are modeled as ellipsoids 
tangent at one point to the unit ball, and hyperspheres are modeled as halves of 
ellipsoids tangent in a (d − 1)-sphere to the unit ball. 
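
For readers who want to move between the two models, the sketch below uses the standard conversion formulas for models sharing the same viewpoint: a Poincare point p corresponds to the Klein point 2p/(1 + |p|^2), and a Klein point k to the Poincare point k/(1 + sqrt(1 − |k|^2)). These formulas are not stated in this paper, and the function names are our own.

```python
import math

def poincare_to_klein(p):
    """Map a point of the Poincare ball model to the Klein model."""
    s = sum(v * v for v in p)              # squared Euclidean norm, < 1
    return tuple(2 * v / (1 + s) for v in p)

def klein_to_poincare(k):
    """Map a point of the Klein ball model to the Poincare model."""
    s = sum(v * v for v in k)
    return tuple(v / (1 + math.sqrt(1 - s)) for v in k)

# Two points on the Poincare line given by the circle of radius 0.75
# centered at (1.25, 0), which meets the unit circle at right angles.
for p in [(0.5, 0.0), (0.65, 0.45)]:
    print(poincare_to_klein(p))            # both images have x-coordinate 0.8
```

The example illustrates the statement above that hyperbolic lines, circular arcs in the Poincare model, become straight segments in the Klein model.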

The Poincare and Klein models are not intrinsic to hyperbolic space, rather 
there can be many such models for the same space. The choice of model is 
determined by the hyperbolic point mapped to the model’s center point, and 
by an orientation of the space around this center. We call this central point the 
viewpoint since it determines the Euclidean view of the hyperbolic space. 

The connection between hyperbolic space and Mobius transformations is this: 
the isometries of the hyperbolic space are in one-to-one correspondence with the 
subset of Mobius transformations of the unit ball that map the unit ball to itself, 
where the correspondence is given by the Poincare model [14, Theorem 6.3]. Any 
hyperbolic isometry (or unit-ball preserving Mobius transformation) can be fac- 
tored into a hyperbolic translation mapping some point of the hyperbolic space 
to the viewpoint, followed by a rotation around the viewpoint [14, Lemma 6.4]. 
Since rotation does not change the Euclidean shape of the objects on the model, 
our problems of selecting an optimal Mobius transformation of a Euclidean or 
spherical space can be rephrased as finding an optimal hyperbolic viewpoint. 
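
In the planar case d = 2 this factorization is easy to make concrete with complex numbers. The sketch below assumes the usual form T_a(z) = (z − a)/(1 − conj(a) z) for the unit-disk map that moves a chosen viewpoint a to the origin, optionally followed by a rotation by an angle theta; the function name and parameters are our own, and the sketch does not cover higher dimensions.

```python
import cmath

def translate_to_origin(a, theta=0.0):
    """Return the unit-disk Mobius map z -> e^{i theta} (z - a)/(1 - conj(a) z),
    which sends the viewpoint a to the center of the disk."""
    if abs(a) >= 1:
        raise ValueError("the viewpoint must lie inside the open unit disk")
    rot = cmath.exp(1j * theta)
    return lambda z: rot * (z - a) / (1 - a.conjugate() * z)

T = translate_to_origin(0.3 + 0.4j)
print(T(0.3 + 0.4j))      # the viewpoint lands on the origin
print(abs(T(1j)))         # boundary points stay on the unit circle
```

Because the rotation factor has modulus one, it changes neither Euclidean distances nor radii, which is why only the choice of a matters for the optimization problems considered here.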



2.2 Quasiconvex Programming 

The viewpoint we seek in our optimal Mobius transformation problems will be 
expressed as the point minimizing the pointwise maximum of a finite set of 
quasiconvex functions; that is, functions for which the level sets are all convex. To find this point, 
we use a generalized linear programming framework of Amenta et al. [1] called 
quasiconvex programming. 

Define a nested convex family to be a map κ(t) from nonnegative real numbers 
to compact convex sets in E^d such that if t < t' then κ(t) ⊂ κ(t'), and such that 
for all t, κ(t) = ∩_{t' > t} κ(t'). Any nested convex family κ determines a function 
f_κ(x) = inf { t | x ∈ κ(t) } on E^d, with level sets that are the boundaries of κ(t). 
If f_κ does not take a constant value on any open set, and if κ(t') is contained 
in the interior of κ(t) for any t' < t, we say that κ is continuously shrinking. 
Conversely, the level sets of any quasiconvex function form the boundaries of the 
convex sets in a nested convex family, and if the function is continuous and not 
constant on any open set then the family will be continuously shrinking. 

If S = {κ_1, ..., κ_n} is a set of nested convex families, and A ⊂ S, let 

    f(A) = inf { (t, x) | x ∈ ∩_{κ_i ∈ A} κ_i(t) } 

where the infimum is taken in the lexicographic ordering, first by t and then by 
the coordinates of x. Amenta et al. [1] defined a quasiconvex program to be a 
finite set S of nested convex families, with the objective function f described 
above, and showed that generalized linear programming techniques could be used 
to solve such programs in linear time for any constant dimension. Due to the 
convexity-preserving properties of the Klein model, we can replace E^d by H^d in 
the definition of a nested convex family without changing the above result. 
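
A one-dimensional toy instance may help fix the definitions. Take each nested convex family to be an interval that grows around a center c, so that its associated function is |x − c|; the quasiconvex program then asks for the smallest t at which all intervals share a point. The Python sketch below solves this instance by a direct formula. It is not the generalized linear programming algorithm of Amenta et al. [1], and all names in it are our own.

```python
def interval_family(c):
    """Nested convex family kappa(t) = [c - t, c + t] on the real line.
    Its associated function is f_kappa(x) = |x - c|."""
    return lambda t: (c - t, c + t)

def solve(centers):
    """Lexicographically smallest (t, x) with x in every kappa_i(t).
    Only the two extreme centers matter; they form a basis of the program."""
    lo, hi = min(centers), max(centers)
    return (hi - lo) / 2, (lo + hi) / 2

def feasible(centers, t, x):
    """Check that x lies in kappa_i(t) for every family."""
    intervals = (interval_family(c)(t) for c in centers)
    return all(a <= x <= b for a, b in intervals)

centers = [1.0, 4.0, 2.5]
t, x = solve(centers)
print(t, x, feasible(centers, t, x))
```

In higher dimensions the same construction with balls in place of intervals gives the smallest enclosing ball problem, a classical example of this framework.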

3 Algorithms 

We now describe how to apply the quasiconvex programming framework de- 
scribed above in order to solve our optimal Mobius transformation problems. In 
each case, we form a set of nested convex families κ_i, where each family corre- 
sponds to a function f_i describing the size of one of the objects in the problem 
(e.g., the transformed radius of a sphere) as a function of the viewpoint loca- 
tion. The solution to the resulting quasiconvex program then gives the viewpoint 
maximizing the minimum of these function values. The only remaining question 
in applying this technique is to show that, for each of the problems we study, 
the functions of interest do indeed have convex level sets. 
3.1 Maximizing the Minimum Sphere 

We begin with the simplest of the problems described in the introduction: finding 
a Mobius transformation that takes the unit ball to itself and maximizes the 
minimum radius among a set of transformed spheres. Equivalently, we are given 
a set of spheres in a hyperbolic space, and wish to choose a viewpoint for a 
Poincare model of the space that maximizes the minimum Euclidean radius of 
the spheres in the model. By symmetry, the radius of a sphere in the Poincare 
model depends only on its hyperbolic radius, and on the hyperbolic distance from 
its center to the viewpoint. Thus, in this case, the level sets of the transformed 
radius are just concentric hyperbolic spheres. 

Theorem 1. Suppose we are given as input a set of n spheres, all contained in 
the unit ball in E^d. Then we can find the Mobius transformation of the unit ball 
that maximizes the minimum radius of the transformed spheres, in O(n) time, 
by quasiconvex programming. 

If we view the unit sphere itself as being one of the input spheres, then 
Theorem 1 can be viewed as minimizing the ratio between the radii of the largest 
and smallest transformed spheres. 
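
The introduction notes that simple hill-climbing is a practical alternative to the quasiconvex programming algorithm. The following Python sketch illustrates that idea for circles in the unit disk (d = 2). It is a sketch under our own assumptions: viewpoints are moved to the origin by the disk map (z − a)/(1 − conj(a) z), the transformed radius is measured from three mapped points of each circle, and the search is a plain eight-direction pattern search, which could in principle stall at a kink of the objective where none of the eight directions improves. All function names are ours.

```python
import cmath

def mobius(a, z):
    """Disk map (z - a) / (1 - conj(a) z), sending the viewpoint a to 0."""
    return (z - a) / (1 - a.conjugate() * z)

def circumradius(p, q, s):
    """Radius of the circle through three non-collinear complex points."""
    cross = ((q - p) * (s - p).conjugate()).imag   # twice the area, up to sign
    return abs(p - q) * abs(q - s) * abs(s - p) / (2 * abs(cross))

def transformed_radius(center, r, a):
    """Euclidean radius of the image of the circle |z - center| = r."""
    pts = [center + r * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
    return circumradius(*[mobius(a, z) for z in pts])

def min_radius(circles, a):
    """Objective: smallest transformed radius for viewpoint a."""
    return min(transformed_radius(c, r, a) for c, r in circles)

def hill_climb(circles, start=0j, step=0.25, tol=1e-6, max_iter=10000):
    """Pattern search over viewpoints in the open unit disk."""
    dirs = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]
    best, value = start, min_radius(circles, start)
    for _ in range(max_iter):
        if step <= tol:
            break
        moves = [best + step * d for d in dirs]
        scored = [(min_radius(circles, z), z) for z in moves if abs(z) < 1]
        top = max(scored, key=lambda t: t[0], default=(value, best))
        if top[0] > value:
            value, best = top          # accept the best improving neighbour
        else:
            step /= 2                  # no neighbour improves: refine the step
    return best, value

circles = [(0.5 + 0j, 0.1), (-0.5 + 0j, 0.1)]
print(hill_climb(circles, start=0.3 + 0.2j))
```

For this symmetric input the search returns a viewpoint close to the center of the disk, where both circles keep their original radius of 0.1.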

Open Problem 1. Is there an efficient algorithm for finding a Mobius trans- 
formation of E^d minimizing the ratio between the radii of the largest and smallest 
transformed spheres, when the input does not necessarily include one sphere that 
contains all the others? 

We can also prove a similar result for radius optimization on the sphere: 

Theorem 2. Suppose we are given as input a set of n spheres in S^d. Then we 
can find the Mobius transformation of S^d that maximizes the minimum radius 
of the transformed spheres, in O(n) time, by quasiconvex programming. 

3.2 Vertex Separation 

We next consider problems of using Mobius transformations to separate a col- 
lection of points. 

Theorem 3. Suppose we are given as input a graph with n vertices and m edges, 
with each vertex assigned to a point on the sphere S^2 or the unit disk in E^2, and 
with edges represented as great circle arcs or straight line segments respectively. 
Then we can find the Mobius transformation that maximizes the minimum length 
of the transformed graph edges, in O(m) time, by quasiconvex programming. 

Similarly, we can find the Mobius transformation maximizing the minimum 
distance among a set of n transformed points by applying Theorem 3 to the 
complete graph K_n. However, the input size in this case is n, while the algorithm 
of Theorem 3 takes time proportional to the number of edges in K_n, O(n^2). With 
care we can reduce the time to near-linear: 
Fig. 2. A planar graph (left) and its coin graph representation (right). 

Theorem 4. Suppose we are given n points in S^2 or the unit disk in E^2. Then 
we can find the Mobius transformation that maximizes the minimum distance 
among the transformed points in O(n log n) time. 

Proof. The Delaunay triangulation of the points can be computed in O(n log n) 
time, is Mobius-invariant (due to its definition in terms of empty circles), forms 
a planar graph with O(n) edges, and is guaranteed to contain the shortest trans- 
formed distance among the points. Therefore, applying Theorem 3 to the De- 
launay triangulation gives the desired result. □ 
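
The claim that the Delaunay triangulation contains the shortest distance rests on the empty-circle criterion: the circle having the closest pair as its diameter contains no other input point, so that pair is a Delaunay edge. The Python sketch below checks this property by brute force on random planar points. It only illustrates the Euclidean fact for a fixed point set and does not construct the triangulation; the function names are our own.

```python
import itertools
import random

def closest_pair(points):
    """Brute-force closest pair among complex points (quadratic time)."""
    return min(itertools.combinations(points, 2),
               key=lambda pq: abs(pq[0] - pq[1]))

def diametral_disk_is_empty(points, u, v):
    """True if no other point lies strictly inside the disk with diameter uv.
    An empty circle through u and v certifies that uv is a Delaunay edge."""
    mid, rad = (u + v) / 2, abs(u - v) / 2
    return all(abs(w - mid) >= rad for w in points if w not in (u, v))

random.seed(1)
pts = [complex(random.random(), random.random()) for _ in range(50)]
u, v = closest_pair(pts)
print(diametral_disk_is_empty(pts, u, v))
```

Since the triangulation is Mobius-invariant, the same argument applies to the point set after any transformation, which is what the proof uses.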

In higher dimensions, the Delaunay triangulation may be complete, and so 
gives us no advantage. However we can again reduce the time from quadratic 
by using a random sampling scheme similar to one from our work on inverse 
parametric optimization problems [9]. 

Theorem 5. Suppose we are given n points in E^d. Then we can find the Mobius 
transformation that maximizes the minimum distance among the transformed 
points in randomized expected time O(n log n). 

Open Problem 2. Is there an efficient deterministic algorithm for maximizing 
the minimum distance among n points in S^d, d ≥ 3? 

4 Applications 

4.1 Spherical Graph Drawing 

As is by now well known, any planar graph can be represented by a set of disjoint 
circles in S^2 such that two vertices are adjacent exactly when the corresponding 
two circles are tangent [6,15,21]. We call such a representation a coin graph; 
Figure 2 shows an example. Although it seems difficult to represent the positions 
of the coins exactly, fast algorithms for computing numerical approximations to 
their positions are known [7,18,22]. By polar projection, we can transform any 
coin graph representation on the sphere to one in the plane or vice versa. See Ken 
Stephenson’s web site http://www.math.utk.edu/~kens/ for more information, 
including software for constructing coin graph representations and a bibliography 
of circle packing papers. 
It is natural to ask for the planar or spherical coin graph representation 
in which all circles are most nearly the same size. It is NP-hard to determine 
whether a planar coin graph representation exists in which all circles are equal, 
or in which the ratio between the maximum and minimum radius satisfies a given 
bound [5]. However, if the graph is maximal planar, its coin graph representation 
is unique up to Mobius transformation, and we can apply Theorem 2 to find 
the optimal spherical coin graph representation. There is also a natural way of 
obtaining a canonical coin graph representation from a non-maximal embedded 
planar graph: add a new vertex in each face, connected to all the vertices of 
the face. Find the coin graph representation of the augmented graph, and delete 
the circles representing the added vertices. Again, Theorem 2 can then find the 
optimal Mobius transformation of the resulting coin graph. 
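
The augmentation step for non-maximal graphs is purely combinatorial and can be sketched as follows. The Python code assumes the embedding is given as a list of faces, each a cycle of vertex indices, which is our own choice of input format; computing the coin graph representation itself is not attempted here.

```python
def augment_faces(n, edges, faces):
    """Add one new vertex inside every face, joined to all vertices of the face.
    n: number of vertices (labelled 0..n-1); edges: iterable of vertex pairs;
    faces: list of vertex cycles describing the embedding.
    Returns the new vertex count, the new edge set, and the added vertices."""
    new_edges = set(map(frozenset, edges))
    added = []
    for face in faces:
        v = n + len(added)                 # label for the vertex placed in this face
        added.append(v)
        new_edges.update(frozenset((v, u)) for u in face)
    return n + len(added), new_edges, added

# A 4-cycle embedded in the sphere has two faces, inside and outside.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
n2, e2, added = augment_faces(4, square, [[0, 1, 2, 3], [0, 3, 2, 1]])
print(n2, len(e2), added)
```

For the 4-cycle the augmented graph has 6 vertices and 12 = 3·6 − 6 edges, so it is maximal planar. After the coin graph of the augmented graph has been found, the circles of the vertices listed in added are the ones to delete.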

Due to the fact that a quasiconvex program only has a single global optimum, 
the transformed coin graph will display any symmetries present in the initial 
graph embedding. That is, any homeomorphism of the sphere that transforms 
the initial embedded graph into itself becomes simply a rotation or reflection of 
the sphere in the optimal embedding. If the graph has a unique embedding then 
any isomorphism of the graph becomes a rotation or reflection. For instance, 
the coin graph representation in Figure 2 (right) has the full symmetry of the 
underlying graph, while the planar drawing on the left of the figure does not 
show the symmetries that switch the vertices in the inner and outer squares. 

Alternatively, by representing each vertex by the center of its circle, a coin 
graph representation can be used to find a straight-line drawing of a planar 
graph, or a drawing on the sphere in which the edges appear as non-crossing 
great-circle arcs. The algorithms in Section 3.2 can then be used to find a repre- 
sentation maximizing the minimum vertex separation, among all Mobius trans- 
formations of the initial vertex positions. 

4.2 Hyperbolic Browser 

There has been quite a bit of recent work in the information visualization com- 
munity on hyperbolic browsers, techniques for using hyperbolic space to aid in 
the visualization of large graphs or graph-like structures such as the World- 
Wide Web [20]. In these techniques, a graph is arranged within a hyperbolic 
plane [16] or three-dimensional hyperbolic space [19], and viewed using the Klein 
or Poincare models, or rendered as it would be seen by a viewer within the hy- 
perbolic space. The main advantage of hyperbolic browsers is that they provide 
a “fish-eye” view [10] that allows both the details of the current focus of interest 
and the overall structure of the graph to be viewed simultaneously. In addition, 
the homogeneous and isotropic geometry of hyperbolic space allows for natural 
and smooth navigation from one view to another. 

Although there are many interesting problems in graph layout for hyperbolic 
spaces, we are interested in a simpler question: where should one place one’s 
initial focus, in order to make the overall graph structure as clear as possible? 
Previous work has handled this problem by the simplistic approach of laying out 
the graph using a rooted subtree such as a breadth-first or depth-first tree, and 
placing the focus at the root of the tree. This approach will work well if the tree 
is balanced, but otherwise deeper parts of the tree may be given a much more 
crowded initial view. Instead, one could use our techniques to find a viewpoint 
that shows the whole graph as clearly as possible. 

We assume we are given a graph, with vertices placed in a hyperbolic plane. 
As in [16], we further assume that each vertex has a circular display region 
where information related to that vertex is displayed; different nodes may have 
display regions of different sizes. It is straightforward to apply Theorem 1 to 
these display regions; the result is a focus placement for the Poincare model 
that maximizes the minimum size of any display region. We can similarly find 
the Klein model maximizing the minimum diameter or width of a transformed 
circle, since the level sets of these functions are again simply concentric balls. By 
applying Theorem 3 we can instead choose the focus for a Poincare model that 
maximizes the minimum distance between vertices, either among pairs from the 
given graph or among all possible pairs. 

Open Problem 3. Is there an efficient algorithm for choosing a Klein model of 
a hyperbolically embedded graph that maximizes the minimum Euclidean distance 
between adjacent vertices? 

We can apply similar methods to find a focus in 3-dimensional hyperbolic 
space that maximizes the minimum solid angle subtended by any display region. 

Open Problem 4. Does there exist an efficient algorithm to find a viewpoint 
in 3-dimensional hyperbolic space maximizing the minimum angle separating any 
pair among n given points? 

Even the 2-dimensional Euclidean version of this maxmin angle separation 
problem is interesting [17]: the level sets are nonconvex unions of two disks, so 
our quasiconvex programming techniques do not seem to apply. 



4.3 Conformal Mesh Generation 

One of the standard methods of two-dimensional structured mesh genera- 
tion [3, 24] is based on conformal mapping (that is, an angle-preserving homeo- 
morphism). The idea is to find a conformal map from the domain to be meshed 
into some simpler shape such as a disk, use some predefined template to form a 
mesh on the disk, and invert the map to lift the mesh back to the original domain. 
Conformal meshes have significant advantages: the orthogonality of the grid lines 
means that one can avoid certain additional terms in the definition of the par- 
tial differential equation to be solved [24]. Nevertheless, despite much work on 
algorithms for finding conformal maps [8, 12, 22, 23, 25] conformal methods are 
often avoided in favor of quasi-conformal mesh generation techniques that allow 
some distortion of angles, but provide greater control of node placement [3,24]. 

Mobius transformations are conformal, and any two conformal maps from a 
simply connected domain to a disk can be related to each other via a Mobius 
transformation. However, different conformal maps will lead to different struc- 
tured meshes: the points of the domain mapped on or near the center of the disk 
will generally be included in mesh elements with the finest level of detail, and 
points near the boundary will be in coarser mesh elements. Therefore, as we now 
describe, we can use our optimal Mobius transformation algorithms to find the 
conformal mesh that best fits the desired level of detail at different parts of the 
domain, reducing the number of mesh elements created and providing some of 
the node placement control needed to use conformal meshing effectively. 

We formalize the problem by assuming an input domain in which certain 
interior points p_i are marked with a desired element size s_i. If we find a conformal 
map f from the domain to a disk, the gradient of f maps the marked element 
sizes to desired sizes s'_i in the transformed disk: s'_i = ||f'(p_i)|| s_i. We can then 
choose a structured mesh with element size min_i s'_i in the disk, and transform 
it back to a mesh of the original domain. The goal is to choose our conformal 
map in a way that maximizes min_i s'_i, so that we can use a structured mesh with 
as few elements as possible. Another way of interpreting this is that s'_i can be 
seen as the radius of a small disk at f(p_i). What we want is the viewpoint that 
maximizes the minimum of these radii. 

By applying a single conformal map, found using one of the aforementioned 
techniques, we can assume without loss of generality that the input domain is 
itself a disk. Since the conformal maps from disks to disks are just the Mobius 
transformations, our task is then to find the Mobius transformation maximizing 
min_i s'_i. Since s'_i depends only on the hyperbolic distance of the viewpoint from 
p_i, the level sets for this problem are themselves disks, so we can solve this 
problem by the same quasiconvex programming techniques as before. Indeed, 
we can view this problem as a limiting case of Theorem 1 for infinitesimally 
small (but unequal) sphere radii. 
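
To make the objective concrete in the disk case, the sketch below evaluates the scaled element sizes for a candidate viewpoint a. It assumes that each desired size is multiplied by the magnitude of the derivative of the map at the marked point, and that the map is the disk transformation T_a(z) = (z − a)/(1 − conj(a) z), whose derivative has magnitude (1 − |a|^2)/|1 − conj(a) z|^2. The function names are our own, and the optimization over a is left out.

```python
def scaled_sizes(points, sizes, a):
    """Element sizes after moving viewpoint a to the disk center:
    each size is multiplied by |T_a'(p)| = (1 - |a|^2) / |1 - conj(a) p|^2."""
    out = []
    for p, s in zip(points, sizes):
        deriv = (1 - abs(a) ** 2) / abs(1 - a.conjugate() * p) ** 2
        out.append(deriv * s)
    return out

def mesh_objective(points, sizes, a):
    """Quantity to maximize over viewpoints: the smallest scaled size."""
    return min(scaled_sizes(points, sizes, a))

pts = [0.2 + 0.1j, -0.5j, 0.7 + 0j]
want = [0.05, 0.02, 0.01]
print(mesh_objective(pts, want, 0j))         # identity map: 0.01
print(mesh_objective(pts, want, 0.4 + 0j))   # viewpoint shifted toward 0.7
```

Moving the viewpoint toward a point with a small desired size enlarges that size in the disk, at the cost of shrinking the sizes at points farther away.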

Open Problem 5. Because of our use of a conformal map to a low aspect 
ratio shape (the unit disk), rotation around the viewpoint does not significantly 
affect element size. Howell [12] describes methods for computing conformal maps 
to high-aspect-ratio shapes such as rectangles. Can one efficiently compute the 
optimal choice of conformal map to a high-aspect-ratio rectangle to maximize 
the minimum desired element size? What if the rectangle aspect ratio can also 
be chosen by the optimization algorithm? 

4.4 Brain Flat Mapping 

In order to visualize and understand the complicated structure of the brain, 
neuroanatomists have sought methods for stretching its convoluted surface folds 
onto a flat plane. Hurdal et al. [13] proposed a principled way of performing this 
stretching via conformal maps: since the surfaces of major brain components 
such as the cerebellum are topologically disks, one can conformally map these 
surfaces onto a Euclidean unit disk, sphere, or hyperbolic plane. Hurdal et al. 
approximate this conformal map by using a fine triangular mesh to represent 
the brain surface, and forming a coin graph representation of this mesh. 
Necessarily, any flat mapping of a curved surface such as the brain’s involves 
some distortions of area, but the distortions produced by conformal mapping can 
be severe; thus, it would be of interest to choose the mapping in such a way that 
the distortion is minimized. As we already noted in section 4.3, the remaining 
degrees of freedom in choosing a conformal mapping can be described by a single 
Mobius transformation. Thus, we need to formulate a measure of distortion, and 
find the transformation optimizing that measure. 

Since we want to measure area, and the mapping constructed by the method 
of Hurdal et al. is performed on triangles of a mesh, the most natural quality 
measure for this purpose seems to be in terms of those triangles: we want to 
minimize the maximum ratio a/a' where a is the area of a triangle in the initial 
three-dimensional map, and a' is the area of its image in the flat map. Unfor- 
tunately, we have not yet been able to extend our techniques to this quality 
measure. A positive answer to the following question would allow us to apply 
our quasiconvex programming algorithms: 

Open Problem 6. Let T be a triangle in the unit disk or sphere, and let C be 
the set of viewpoints for Mobius transformations that transform T into a triangle 
of area at least A. Is C necessarily convex? 

Instead of attempting to optimize the area of the triangles, it seems simpler 
(although perhaps less accurate) to optimize the area of the disks in the coin 
graph. Under the assumption that the initial triangular mesh has elements of 
roughly uniform size, it would be desirable that the coin graph representation 
similarly uses disks of as uniform size as possible. This can be achieved by our 
linear-time algorithms by applying Theorem 1 in the unit disk, or Theorem 2 
in the sphere. In case the triangular mesh is nonuniform, it may be appropriate 
to apply a weighted version of these theorems, where the weight of each disk is 
computed from the lengths of edges incident to the corresponding mesh vertex. 



5 Conclusions 

We have identified several applications in information visualization and struc- 
tured mesh generation for which it is of interest to find a Mobius transformation 
that optimizes an objective function, typically defined as the minimum size of a 
collection of geometric objects. Further, we have shown that these problems can 
be solved either by local optimization techniques, or by linear-time quasiconvex 
programming algorithms. For the problems where the input to the quasiconvex 
program is itself superlinear in size (maximizing the minimum distance between 
transformed points) we have described Delaunay triangulation and random sam- 
pling techniques for solving the problems in near-linear time. 

We have listed open problems arising in our investigations throughout the 
paper. There is also an important problem in the practical application of our 
algorithms: although there should be little difficulty implementing local opti- 
mization techniques for our problems, the linear-time quasiconvex programming 
algorithms are based on two primitives that (while constant time by general prin- 
ciples) have not been specified in sufficient detail for an implementation, one to 
test a new constraint against a given basis and the other to find the changed 
basis of a set formed by adding a new constraint to a basis. If the basis repre- 
sentation includes the value of the objective function, testing a new constraint 
is simply a matter of evaluating the corresponding object size and comparing it 
to the previous value. However, the less frequent basis change operations require 
a closer examination of the detailed structure of each problem, which 
we have not carried out. For an example of the difficulty of this step see [11]. 
In practice it may be appropriate to combine the two approaches, using local 
optimization techniques to find a numerical approximation to the basis change 
operations needed for the quasiconvex programming algorithms. Especially in 
the coin graph application, the input to the quasiconvex program is already it- 
self a numerical approximation, so this further level of approximation should 
not cause additional problems, but one would need to verify that a quasiconvex 
programming algorithm can behave robustly with approximate primitives. 

More generally, there are very few computational geometry algorithms in- 
volving hyperbolic geometry (a notable exception being [4]) although many 
Euclidean constructions such as the Delaunay triangulation or hyperplane ar- 
rangements can be translated to the hyperbolic case without difficulty using 
the Poincare or Klein models. We expect many other interesting problems and 
algorithms to be discovered in this area. 
Abstract. We prove approximation guarantees for randomized algo- 
rithms for packing and covering integer programs expressed in certain 
normal forms. The bounds are in terms of the pseudo-dimension of the 
matrix of the coefficients of the constraints and the value of the opti- 
mal solution; they are independent of the number of constraints and the 
number of variables. The algorithms take time polynomial in the length 
of the representation of the integer program and the value of the optimal 
solution. We establish a related result for a class we call the mixed cover- 
ing integer programs, which contains the covering integer programs. We 
describe applications of these techniques and results to a generalization 
of Dominating Set motivated by distributed file sharing applications, to 
an optimization problem motivated by an analysis of boosting, and to a 
generalization of matching in hypergraphs. 



1 Introduction 

Raghavan and Thompson [RT87] introduced randomized rounding. Roughly, 
their idea was to construct algorithms for integer programming problems as 
follows: 

— solve a similar problem without the integrality constraint, and 

— round each variable up or down, using its fractional part as the probability 
of rounding up. 
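
The rounding step in the second item can be sketched in a few lines of Python. This is only an illustration of the basic idea as described above, with names of our own choosing; it takes the fractional solution as given and does not solve the relaxed problem.

```python
import math
import random

def randomized_round(x, rng=random):
    """Round each entry of x up with probability equal to its fractional
    part and down otherwise, so each rounded entry has expectation x[i]."""
    rounded = []
    for v in x:
        base = math.floor(v)
        frac = v - base                      # probability of rounding up
        rounded.append(base + (1 if rng.random() < frac else 0))
    return rounded

rng = random.Random(0)                       # fixed seed for reproducibility
print(randomized_round([0.2, 1.5, 3.0, 2.9], rng))
```

Entries that are already integral are never changed, since their fractional part is zero.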

This technique provides strong approximation guarantees for polynomial-time 
algorithms for a class of problems called covering and packing integer programs 
[Rag88, Sri99]. In a covering integer program, for an m × n matrix A and column 
vectors c and b, all with only nonnegative entries, the goal is to find x ∈ Z_+^n 
to minimize c^T x subject to Ax ≥ b. In a packing integer program, it is also 
assumed that A, c and b have only nonnegative entries, but the goal is to choose 
x ∈ Z_+^n to maximize c^T x subject to Ax ≤ b. Raghavan asked whether 
one could exploit algebraic properties of A such as its rank to obtain stronger 
approximation guarantees. 
In this paper, we report on work along these lines. One can assume without
loss of generality that covering and packing integer programs satisfy
$A \in [0,1]^{m \times n}$, $c \in [1,\infty)^n$ and $b = (1,1,\ldots,1)^T$ (see
[Sri99] and Sections 3 and 5 of this paper). For covering and packing integer
programs that are expressed this way, we prove approximation guarantees for
efficient algorithms that are independent of the number of variables and
constraints, and are in terms of the pseudo-dimension [Pol84, Hau92] of $A$. We
also establish a similar result for mixed covering integer programs, which are
like covering integer programs but without the requirement that the components
of $A$ are nonnegative.

The pseudo-dimension can be defined as follows. Say that an $m \times k$
matrix is full if the origin in $\mathbb{R}^k$ can be translated so that the rows
of the matrix occupy all $2^k$ orthants. The pseudo-dimension of $A$ is the size
of the largest set of columns of $A$ such that the matrix obtained by deleting
all other columns is full. A more formulaic definition is given in Section 2.

Since the pseudo-dimension of $A$ is at most its rank [Dud78, Pol84], our
analysis implies results like those envisaged by Raghavan. Sometimes, however,
the pseudo-dimension of a matrix is much smaller than its rank. For example, the
pseudo-dimension of any identity matrix is 1.

Our general results are as follows. All of our algorithms are randomized and,
with probability 1/2, achieve the claimed approximations in time polynomial in
the number of bits needed to write $A$ and $c$ and the value of the optimal
solution. (For many commonly studied combinatorial optimization problems, the
value of the optimal solution is bounded by a polynomial in the size of the
input [CK].) The algorithm for covering integer programs outputs a solution
whose value is $O(\mathrm{opt}(1 + d r \log(r\,\mathrm{opt})))$, where $d$ is the
pseudo-dimension of $A$ and $r$ is the value of the largest entry in $A$; since
$r \le 1$, the value of the algorithm's solution is also
$O(d\,\mathrm{opt} \log \mathrm{opt})$. For mixed covering integer programs, the
bound is $O(\mathrm{opt}(1 + d r^2\,\mathrm{opt}))$; here we cannot assume that
$r \le 1$. For packing integer programs, our algorithm obtains a solution whose
value is $\Omega\left(\mathrm{opt} / (r\,\mathrm{opt})^{\kappa d r}\right)$ for a
constant $\kappa > 0$ (here once again $r \le 1$).

We illustrate the application of our general result about covering integer
programs using the B-domination problem [NR95, Sri99], a generalization of
Dominating Set motivated by distributed file sharing applications. In the
B-domination problem, the goal is to locate as few facilities as possible at the
nodes of a network so that each node of the network has at least $B$ facilities
within one hop. We give a randomized algorithm that, for graphs of constant
genus, with probability 1/2, outputs a solution of size
$O\left(\mathrm{opt}\left(1 + \frac{\log \mathrm{opt}}{B}\right)\right)$ in
polynomial time.

Our study of mixed covering integer programs was inspired by a learning
problem, which can be abstracted as the minimum majority problem as follows:
given an $m \times n$ matrix $A$ with entries in $\{-1, 1\}$, choose
$x \in \mathbb{Z}_+^n$ to minimize $\sum_{i=1}^n x_i$ subject to $Ax > 0$. Our
algorithm for mixed covering integer programs yields a bound of
$O(d\,\mathrm{opt}^2)$ for this problem. We derive our motivation for this
problem from an analysis of the generalization ability of hypotheses output by
boosting algorithms [SFBL98]; details are given in Section 6.2.
Our general results about packing integer programs can be applied to simple
B-matching. Here, given a family $\mathcal{S}$ of subsets of a finite set $X$,
the goal is to output as many of the sets in $\mathcal{S}$ as possible while
ensuring that each element of $X$ is included in at most $B$ of the chosen sets.
We give a randomized polynomial-time algorithm for this problem that outputs a
solution of size $\Omega((\mathrm{opt}/B)^{1-\kappa d/B})$, where $d$ is the
VC-dimension of the dual of the input and $\kappa > 0$ is an absolute constant.

Our work builds on that of Brönnimann and Goodrich and of Pach and Agarwal
[PA95], who established approximation guarantees for polynomial-time
algorithms for Set Cover in terms of the VC-dimension of the dual of the input
set system. Our analysis of covering integer programs is a generalization of the
analysis of Pach and Agarwal. Set Cover can be formulated as a covering integer
program, and the pseudo-dimension of the resulting coefficient matrix is the
same as the VC-dimension of the dual of the input set system.

Srinivasan [Sri99] showed that if a fractional solution is rounded as originally
proposed by Raghavan and Thompson, then the events that the constraints are
violated are positively correlated, and used this to improve the analysis of
randomized rounding for packing and covering integer programs. Recently, he
provided RNC and NC algorithms with the same approximation guarantees.
However, his approximation bounds still depend on $m$.

Baker [Bak94] described a polynomial-time approximation scheme for
Dominating Set when the input is restricted to be planar.

It is not hard to see how to use boosting [Sch90, Fre95], together with
Lemma 3.3 of [HMP+93], to design an algorithm for the minimum majority
problem that outputs a solution with value $O(\mathrm{opt}^2 \log m)$. Since
$d \le \log m$, our bound is never more than a constant factor worse than this,
but when $d \ll \log m$, it is significantly better.

For simple B-matching, the only bounds we know are in terms of opt and
|A|; the best is H ^ ^ ISriHhl . When d « B « opt << \X\ 

and $d \ll \log |X|$ (note again that $d \le \log |X|$), our bound improves on this
significantly.

2 Preliminaries 

Denote the nonnegative rationals by $\mathbb{Q}_+$, and the nonnegative integers
by $\mathbb{Z}_+$.

For a countable set $X$, a probability distribution $D$ over $X$, and a predicate
$\varphi$ over $X$, denote by $\Pr_{x \in D}(\varphi(x))$ the probability that
$\varphi(x)$ is true when $x$ is chosen according to $D$. Define
$\mathbf{E}_{x \in D}$ similarly. Denote by $D^\ell$ the distribution on $X^\ell$
obtained by sampling $\ell$ times independently according to $D$.

For a domain $X$, and a subset $S$ of $X$, define $\chi_S$ to be the indicator
function for $S$, i.e. the function from $X$ to $\{0, 1\}$ for which
$\chi_S(x) = 1 \Leftrightarrow x \in S$.

For a domain $X$, say that a set $\mathcal{F}$ of real-valued functions defined
on $X$ shatters a sequence $x_1, \ldots, x_d$ of elements of $X$ if there is a
sequence $r_1, \ldots, r_d$ of real thresholds such that for any
$b_1, \ldots, b_d \in \{\text{above}, \text{below}\}$, there is an
$f \in \mathcal{F}$ such that for all $i \in \{1, \ldots, d\}$,
$f(x_i) \ge r_i \Leftrightarrow b_i = \text{above}$. Define the pseudo-dimension
of $\mathcal{F}$, denoted by $\mathrm{Pdim}(\mathcal{F})$, to be the length of the longest



Using the Pseudo-Dimension to Analyze Approximation Algorithms 






sequence shattered by $\mathcal{F}$. The VC-dimension of a set $\mathcal{F}$ of
functions from $X$ to $\{0, 1\}$, denoted by $\mathrm{VCdim}(\mathcal{F})$, is its
pseudo-dimension. The VC-dimension of a family $\mathcal{S}$ of subsets of $X$ is
the VC-dimension of $\{\chi_S : S \in \mathcal{S}\}$.

For a real matrix $A$, define the pseudo-dimension of $A$, denoted by
$\mathrm{Pdim}(A)$, by thinking of the rows of $A$ as functions and taking the
pseudo-dimension of the resulting class of functions. Specifically, if $A$ is an
$m \times n$ matrix, for each $i \in \{1, \ldots, m\}$, define
$f_{A,i} : \{1, \ldots, n\} \to \mathbb{R}$ by $f_{A,i}(j) = A_{ij}$ and let
$\mathrm{Pdim}(A) = \mathrm{Pdim}(\{f_{A,i} : i \in \{1, \ldots, m\}\})$.
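
For very small matrices the definition can be checked by exhaustive search. The sketch below is only an illustration under stated assumptions: the names `shatters` and `pdim` are hypothetical, "above" is read as $f(x_i) \ge r_i$, and the running time is exponential, so it is not meant as a practical algorithm.

```python
from itertools import combinations, product

def shatters(A, cols):
    """True if the rows of A, viewed as functions on column indices, shatter cols."""
    # A threshold only matters through which entries it separates, so it is
    # enough to try the values that actually occur in each column.
    candidates = [sorted({row[j] for row in A}) for j in cols]
    for thresholds in product(*candidates):
        patterns = {
            tuple(row[j] >= r for j, r in zip(cols, thresholds)) for row in A
        }
        if len(patterns) == 2 ** len(cols):
            return True
    return False

def pdim(A):
    """Brute-force pseudo-dimension of a matrix given as a list of rows."""
    n = len(A[0])
    best = 0
    for d in range(1, n + 1):
        if any(shatters(A, cols) for cols in combinations(range(n), d)):
            best = d
        else:
            break  # every subset of a shattered set is itself shattered
    return best

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(pdim(identity))
```

On an identity matrix no two columns can be shattered, because no row is above the threshold in both columns at once.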

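For a family $\mathcal{S}$ of sets define the dual of $\mathcal{S}$, denoted by
$\mathrm{dual}(\mathcal{S})$, as follows. For each $x \in \bigcup_{S \in \mathcal{S}} S$,
let $Q_{x,\mathcal{S}} = \{S \in \mathcal{S} : x \in S\}$. Let
$\mathrm{dual}(\mathcal{S}) = \{Q_{x,\mathcal{S}} : x \in \bigcup_{S \in \mathcal{S}} S\}$.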
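
As a small illustration of the dual construction, the following sketch represents a set system as a list of `frozenset` objects; the function name `dual` and this representation are assumptions made for the example.

```python
def dual(family):
    """Return the dual of a set system given as an iterable of frozensets."""
    family = list(family)
    ground = set().union(*family)
    # For each ground element x, collect the members of the family containing x.
    return {frozenset(S for S in family if x in S) for x in ground}

S1 = frozenset({1, 2})
S2 = frozenset({2, 3})
print(dual([S1, S2]))
```

Here element 1 yields $\{S_1\}$, element 2 yields $\{S_1, S_2\}$ and element 3 yields $\{S_2\}$, so the dual has three members.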



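Lemma 1 ([Vap82, Pol84]). There is a constant $\kappa > 0$ such that for any
$r > 0$, any finite set $X$, any set $\mathcal{F}$ of functions from $X$ to
$[0, r]$, any $\epsilon > 0$, and any probability distribution $D$ over $X$, if
$\ell \ge \frac{\kappa r\,\mathrm{Pdim}(\mathcal{F})}{\epsilon} \ln \frac{r}{\epsilon}$, then

$$\Pr_{(z_1,\ldots,z_\ell) \in D^\ell}\left(\exists f \in \mathcal{F},\ \mathbf{E}_{x \in D}(f(x)) \ge \epsilon \text{ but } \sum_{i=1}^{\ell} f(z_i) \le \epsilon\ell/2\right) \le 1/4.$$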



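Lemma 2 ([Tal94]). There is a constant $\kappa > 0$ such that for any real $a$
and $b$ with $a < b$ (let $r = b - a$), any finite set $X$, any set $\mathcal{F}$
of functions from $X$ to $[a, b]$, any $\epsilon > 0$, and any probability
distribution $D$ over $X$, if
$\ell \ge \frac{\kappa r^2\,\mathrm{Pdim}(\mathcal{F})}{\epsilon^2}$, then

$$\Pr_{(z_1,\ldots,z_\ell) \in D^\ell}\left(\exists f \in \mathcal{F},\ \left|\mathbf{E}_{x \in D}(f(x)) - \frac{1}{\ell}\sum_{i=1}^{\ell} f(z_i)\right| > \epsilon\right) \le 1/4.$$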



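Lemma 3. There is a constant $\kappa > 0$ such that for any $r > 0$, any finite
set $X$, any set $\mathcal{F}$ of functions from $X$ to $[0, r]$, any
$\epsilon > 0$, and any probability distribution $D$ over $X$, and for any
$\alpha \ge 1$, if
$\ell \ge \frac{\kappa r\,\mathrm{Pdim}(\mathcal{F})}{\epsilon\,\alpha \ln(1+\alpha)} \ln \frac{r}{\epsilon}$, then

$$\Pr_{(z_1,\ldots,z_\ell) \in D^\ell}\left(\exists f \in \mathcal{F},\ \mathbf{E}_{x \in D}(f(x)) \le \epsilon \text{ but } \sum_{i=1}^{\ell} f(z_i) \ge (1+\alpha)\epsilon\ell\right) \le 1/4.$$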

We are not aware of a reference for Lemma 3. Its proof, whose rough outline
follows those of related results (see [Pol84, Hau92, AB93, AB99]), is omitted due
to space constraints.

3 Covering Integer Programs 

In a covering integer program, for natural numbers $n$ and $m$, column vectors
$c \in \mathbb{Q}_+^n$ and $b \in \mathbb{Q}_+^m$, and a matrix
$A \in \mathbb{Q}_+^{m \times n}$, the goal is to choose $x \in \mathbb{Z}_+^n$ to
minimize $c^T x$ subject to $Ax \ge b$.
Srinivasan [Sri99] showed that one can assume without loss of generality that
$A \in [0,1]^{m \times n}$ and $b \in [1,\infty)^m$. By dividing each row $i$ of
$A$ by $b_i$, one can further assume w.l.o.g. that each component of $b$ is 1.
Furthermore, one can assume that each component of $c$ is positive, since if some
$c_j = 0$, one can eliminate the $j$th variable by deleting all constraints that
can be satisfied by making it arbitrarily large. Finally, we can scale $c$ so
that its least component is 1. This is summarized in the following.

Definition 1. A covering integer program in normal form is given by a matrix
$A \in [0,1]^{m \times n}$ and a column vector $c \in [1,\infty)^n$. The goal is to
find a column vector $x \in \mathbb{Z}^n$ such that $x \ge (0,0,\ldots,0)^T$ and
$Ax \ge (1,1,\ldots,1)^T$ in order to minimize $c^T x$.
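
The last two normalization steps (dividing each row by $b_i$ and rescaling $c$) can be sketched as follows. The function name `to_normal_form` is hypothetical, and the sketch assumes the earlier reductions have already been applied, so that every $b_i \ge 1$ and every $c_j > 0$.

```python
def to_normal_form(A, b, c):
    """Rescale a covering program so that b becomes all ones and min(c) is 1.

    Assumes every b_i is positive and every c_j is positive.
    """
    # Dividing row i by b_i turns the constraint (A x)_i >= b_i into (A' x)_i >= 1.
    A_norm = [[a / b_i for a in row] for row, b_i in zip(A, b)]
    # Scaling c by a positive constant does not change which x is optimal.
    c_min = min(c)
    c_norm = [c_j / c_min for c_j in c]
    return A_norm, c_norm

A_norm, c_norm = to_normal_form([[1.0, 0.5], [0.5, 1.0]], [2.0, 1.0], [3.0, 6.0])
```

Scaling $c$ changes the optimal value by the same constant factor, so approximation ratios are unaffected.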

Theorem 1. There is a polynomial $q$ and a randomized algorithm $R$ with the
following property. For any covering integer program $(A, c)$ in normal form, if
$r = \max_{i,j} A_{ij}$ and $L$ is the number of bits required to write $A$ and
$c$, then with probability 1/2, Algorithm $R$ outputs a feasible solution $x$ in
$q(L, \mathrm{opt}(A, c))$ time whose cost is
$O(\mathrm{opt}(A, c)(1 + r\,\mathrm{Pdim}(A) \log(r\,\mathrm{opt}(A, c))))$.

Proof Sketch: For the sake of brevity, we will consider an algorithm (let's call
it $R'$) that makes use of the knowledge of $\mathrm{Pdim}(A)$. It is not hard to
see how to remove the need for this knowledge. Algorithm $R'$ is as follows.

— Solve the linear program obtained by relaxing the integrality constraint. Call
the solution $u$.

— Set $Z = \sum_{i=1}^n u_i$ and $p = u/Z$. Note that $p$ can be interpreted as a
probability distribution on $\{1, \ldots, n\}$. Note also that $Z \ge 1/r$, since
otherwise all constraints would be violated.

— Let $\kappa$ be as in Lemma 1 and
$\ell = \max\{\lceil 2\kappa\,\mathrm{Pdim}(A)\, r Z \ln(rZ) \rceil, \lceil 2Z \rceil\}$.
Sample $\ell$ times at random independently according to $p$, and, for each $j$,
let $x_j$ be the number of times that $j$ occurs.

— Output $x = (x_1, \ldots, x_n)$.
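
The sampling steps of Algorithm $R'$ can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the fractional solution `u` is passed in instead of being computed by a linear programming solver, the constant `kappa` from Lemma 1 is not known explicitly and is supplied by the caller, and all function names are hypothetical.

```python
import math
import random

def sample_size(kappa, d, r, Z):
    """ell = max(ceil(2*kappa*d*r*Z*ln(r*Z)), ceil(2*Z)), as in the third step."""
    return max(math.ceil(2 * kappa * d * r * Z * math.log(r * Z)),
               math.ceil(2 * Z))

def round_by_sampling(A, u, kappa, d, rng):
    """Turn a fractional cover u (with A u >= 1) into integer counts x."""
    Z = sum(u)
    r = max(max(row) for row in A)
    ell = sample_size(kappa, d, r, Z)
    x = [0] * len(u)
    # Drawing index j with probability u_j / Z is sampling from p = u / Z.
    for j in rng.choices(range(len(u)), weights=u, k=ell):
        x[j] += 1
    return x

def is_feasible(A, x):
    """Check the normal-form covering constraints A x >= (1, ..., 1)."""
    return all(sum(a * xj for a, xj in zip(row, x)) >= 1 for row in A)

A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
u = [0.0, 1.0, 0.0]  # a fractional optimum that happens to be integral
x = round_by_sampling(A, u, kappa=1.0, d=1, rng=random.Random(0))
```

In this toy instance all probability mass sits on the middle column, so the output is determined regardless of the random seed.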

Choose an input $(A, c)$ and let $r = \max_{i,j} A_{ij}$,
$\mathrm{opt} = \mathrm{opt}(A, c)$, and $d = \mathrm{Pdim}(A)$. Let
$a_1, \ldots, a_m$ be the rows of $A$. Since $Au \ge (1,1,\ldots,1)^T$, we have
$Ap \ge (1/Z)(1,1,\ldots,1)^T$. Thus, for each $i$, we have
$\mathbf{E}_{j \in p}(A_{ij}) = \mathbf{E}_{j \in p}(f_{A,i}(j)) \ge 1/Z$.

Since, for each $i$, incrementing $x_j$ has the effect of increasing $a_i \cdot x$
by $A_{ij} = f_{A,i}(j)$, applying Lemma 1 with $\epsilon = 1/Z$, with probability
at least 3/4, for all $i$, $a_i \cdot x \ge \ell/(2Z) \ge 1$. Thus,

$$\Pr(x \text{ is not feasible}) \le 1/4. \qquad (1)$$



We have

$$\mathbf{E}(c^T x) = \ell c^T p \le \ell\,\mathrm{opt}/Z \le \frac{\max\{\lceil 2\kappa d r Z \ln(rZ) \rceil, \lceil 2Z \rceil\}\,\mathrm{opt}}{Z}.$$
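Thus, Markov's inequality implies that

$$\Pr\left(c^T x > \frac{4\max\{\lceil 2\kappa d\,(rZ) \ln(rZ) \rceil, \lceil 2Z \rceil\}\,\mathrm{opt}}{Z}\right) \le 1/4. \qquad (2)$$

Since each $c_i \ge 1$, we have
$Z = \sum_{i=1}^n u_i \le \sum_{i=1}^n c_i u_i \le \mathrm{opt}$. Combining with
(1) and (2) completes the proof. □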




4 Mixed Covering Integer Programs 

In a mixed covering integer program, for natural numbers $m$ and $n$, column
vectors $c \in \mathbb{Q}_+^n$ and $b \in \mathbb{Q}_+^m$, and a matrix
$A \in \mathbb{Q}^{m \times n}$, the goal is to choose $x \in \mathbb{Z}_+^n$ to
minimize $c^T x$ subject to $Ax \ge b$. Note that to be a mixed covering
integer program, the entries of $A$ need not be nonnegative. If
$b = (1, 1, \ldots, 1)^T$ and $c \in [1,\infty)^n$, then we say that the mixed
covering integer program is in normal form. (This can be seen to be without loss
of generality as with covering integer programs.) Note however, that here we
cannot assume without loss of generality that the entries of $A$ are at most 1.

Theorem 2. There is a polynomial $q$ and a randomized algorithm $R$ with the
following property. For any mixed covering integer program $(A, c)$ in normal
form, if $r = \max_{i,j} |A_{ij}|$ and $L$ is the number of bits required to write
$A$ and $c$, then with probability 1/2, Algorithm $R$ outputs a feasible solution
$x$ in $q(L, \mathrm{opt}(A, c))$ time whose cost is
$O(\mathrm{Pdim}(A)\, r^2\, \mathrm{opt}(A, c)^2 + \mathrm{opt}(A, c))$.

Proof Sketch: As in the proof of Theorem 1, we will consider an algorithm $R'$
that “knows” $\mathrm{Pdim}(A)$; the algorithm is the same as in that proof,
except $\kappa$ is defined as in Lemma 2 and
$\ell = \max\{\lceil 4\kappa\,\mathrm{Pdim}(A)\, r^2 Z^2 \rceil, \lceil 2Z \rceil\}$.
We will borrow notation from that proof.

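As before, since $Au \ge (1,1,\ldots,1)^T$ and $p = u/Z$, for each
$i \in \{1, \ldots, m\}$, $\mathbf{E}_{j \in p}(f_{A,i}(j)) \ge 1/Z$. Thus, writing
$j_1, \ldots, j_\ell$ for the sampled indices,

$$\begin{aligned}
\Pr(x \text{ is not feasible}) &\le \Pr\left(\exists i,\ \mathbf{E}(f_{A,i}) \ge 1/Z \text{ but } \sum_{t=1}^{\ell} f_{A,i}(j_t) < 1\right) \\
&\le \Pr\left(\exists i,\ \mathbf{E}(f_{A,i}) \ge 1/Z \text{ but } \frac{1}{\ell}\sum_{t=1}^{\ell} f_{A,i}(j_t) < \frac{1}{2Z}\right) \\
&\le \Pr\left(\exists i,\ \mathbf{E}(f_{A,i}) - \frac{1}{\ell}\sum_{t=1}^{\ell} f_{A,i}(j_t) > \frac{1}{2Z}\right),
\end{aligned}$$

which is at most 1/4 by Lemma 2. But

$$\mathbf{E}(c^T x) \le \ell\,\mathrm{opt}/Z \le \frac{\max\{\lceil 4\kappa\,\mathrm{Pdim}(A)\, r^2 Z^2 \rceil, \lceil 2Z \rceil\}\,\mathrm{opt}}{Z}.$$

Applying Markov's inequality and the fact that $Z \le \mathrm{opt}$ as in the
proof of Theorem 1 completes the proof. □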






P.M. Long 



5 Packing Integer Programs 

In a packing integer program, for natural numbers $n$ and $m$, column vectors
$c \in \mathbb{Q}_+^n$ and $b \in \mathbb{Q}_+^m$, and a matrix
$A \in \mathbb{Q}_+^{m \times n}$, the goal is to choose $x \in \mathbb{Z}_+^n$ to
maximize $c^T x$ subject to $Ax \le b$.

Arguing as for covering, one can assume without loss of generality that entries
of $A$ are in $[0, 1]$ and $b = (1, 1, \ldots, 1)^T$. Furthermore, one can also
assume in this case that each component of $c$ is positive; here if some
$c_j = 0$, one might as well set $x_j = 0$, and thus, the $j$th variable can be
eliminated. Since again we can scale $c$ so that its least component is 1, we
arrive at the following.

Definition 2. A packing integer program in normal form is given by a matrix
$A \in [0,1]^{m \times n}$ and a column vector $c \in [1,\infty)^n$. The goal is to
find a column vector $x \in \mathbb{Z}^n$ such that $x \ge (0,0,\ldots,0)^T$ and
$Ax \le (1,1,\ldots,1)^T$ in order to maximize $c^T x$.



Theorem 3. There is a constant $\kappa > 0$, a randomized polynomial-time
algorithm $R$ and a polynomial $q$ with the following property. For any packing
integer program $(A, c)$ in normal form, if $B$ is the least integer such that
$\max_{i,j} A_{ij} \le 1/B$, $L$ is the number of bits in the representation of $A$ and $c$,
and $d = \mathrm{Pdim}(A)$, with probability 1/2, Algorithm $R$ outputs a feasible solution



Proof Sketch: The fact that the entries of $A$ are at most $1/B$ implies that any
$x$ with $\sum_{i=1}^n x_i \le B$ is feasible. This, together with the fact that each component
of $c$ is at least 1, implies that it is trivial to find a solution of value $B$. Hence, we
can assume without loss of generality that (^pt > B and therefore, since 

opt > B, that kd/B < 1. 

Again, we will consider an algorithm R' that “knows” Pdim(A): 

— Solve the linear program obtained by relaxing the integrality constraint. Call
the solution $u$.

— Set $Z = \sum_{i=1}^n u_i$ and $p = u/Z$. (Note that $Z \ge B$; otherwise, since the
entries of $A$ are at most $1/B$, no constraints would be binding, and $u$ could
be improved.)



— Let K be as in Lemma 0 d = Pdim(A), a = (Z/B)^^^/^, I = 



p, and, for each j, let Xj be the number of times that j occurs. 

— Output $x = (x_1, \ldots, x_n)$.

Choose an input $(A, c)$ and let $a_1, \ldots, a_m$ be the rows of $A$. Let $B$, $d$, $\alpha$ and
$\ell$ be as in the description of Algorithm $R'$, and let $\mathrm{opt} = \mathrm{opt}(A, c)$.

Suppose $\ell = 1$. Again, since the entries in $A$ are at most 1, and the constraints
are of the form $a_i \cdot x \le 1$, then since in this case $\sum_{i=1}^n x_i = 1$, $x$ is certainly
feasible.



X in < 7 (L, opt(A, c)) time whose solution has value that is fl 




K,d{Z/B) \n{Z/B) 
ol ln(l+a) 



. Sample £ times at random independently according to 
Suppose $\ell > 1$. Since $Au \le (1, 1, \ldots, 1)^T$, for each $i$, we have $\mathbf{E}(a_i \cdot x) \le \ell/Z$.
Applying Lemma 3 (note that since $Z \ge B$, $\alpha \ge 1$), with probability
at least 3/4, for all $i$,



Qi ■ X < {1 + a)llZ 



1 -I- a Kd(Z ! B)\n(Z ! B) 
Z aln(l-|-Q!) 



AndiZlB) ln{Z/B) 
Zh\{l + a) 



< 1 



where the second inequality holds because $\alpha \ge 1$ and $\ell > 1$.
Thus, whatever the value of $\ell$, we have

$$\Pr(x \text{ is not feasible}) \le 1/4. \qquad (3)$$

Applying Chebyshev’s inequality yields 



f T £<^u ! £c^u\ . , , 

Pricx<— 2y^-j<l/4. (4) 

Substituting the value of £ and simplifying, we have 

£(Fu c^u c^u opt 

since $c^T u \ge \mathrm{opt}$ and $\kappa d/B < 1$. Putting this together with (3) and (4) completes
the proof. □



6 Applications 

In this section, we give examples of the application of our general results. 



6.1 Dominating Set and Extensions 

The B-domination problem [NR95, Sri99] is defined as follows: given a graph
G = (V, E), place as few facilities as possible on the vertices of G in such a way 
that each vertex has at least B facilities in its neighborhood. The neighborhood 
of a vertex is defined to consist of the vertex and all vertices sharing an edge 
with it. 

Define $\mathcal{N}(G)$ to be the set system consisting of all the neighborhoods in $G$,
i.e.

$$\mathcal{N}(G) = \{\{w : \{v, w\} \in E\} \cup \{v\} : v \in V\}.$$

Theorem 4. For each natural number B, there is a polynomial-time algorithm 
A for the B-domination problem such that for any graph G with optimal solution 
opt{G,B), algorithm A outputs a solution of size 
Proof: If $x_v$ is the number of facilities located at vertex $v$, the problem is to
minimize $\sum_{v \in V} x_v$ subject to the constraints, one for each vertex $v$, that

$$\left(\sum_{w \in N(v)} x_w\right)/B \ge 1,$$

where $N(v)$ denotes the neighborhood of $v$ (and that the $x_v$'s are nonnegative
integers). Since $\mathcal{N}(G) = \mathrm{dual}(\mathcal{N}(G))$, and
scaling all members of a set of functions by a common constant factor does not
change its pseudo-dimension, applying Theorem 1 completes the proof. □
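
The covering program used in this proof can be written down explicitly. The sketch below builds the normal-form coefficient matrix for a small graph; the function name `b_domination_program` and the vertex and edge list representation are assumptions made for illustration.

```python
def b_domination_program(vertices, edges, B):
    """Return (A, c) for the normal-form covering program of B-domination.

    Row v has entry 1/B in column w exactly when w lies in the closed
    neighborhood of v, so row v reads (sum of x_w over the neighborhood) / B >= 1.
    """
    closed = {v: {v} for v in vertices}
    for u, w in edges:
        closed[u].add(w)
        closed[w].add(u)
    A = [[1.0 / B if w in closed[v] else 0.0 for w in vertices] for v in vertices]
    c = [1.0] * len(vertices)
    return A, c

# A path a - b - c with B = 2: two facilities at b dominate every vertex.
A, c = b_domination_program(["a", "b", "c"], [("a", "b"), ("b", "c")], 2)
```

The matrix is symmetric because $w$ lies in the neighborhood of $v$ exactly when $v$ lies in the neighborhood of $w$, which is the matrix form of the observation that the neighborhood system coincides with its dual.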

The following is an example of how this can be applied. Recall that the genus 
of a graph is, informally, the number of “handles” that need to be added to the 
plane before the graph can be embedded without any edge crossings. 



Theorem 5. Choose a fixed nonnegative integer $k$.

For each natural number $B$, there is a polynomial-time algorithm $A$ for the
B-domination problem such that for any graph $G$ of genus at most $k$ with optimal
solution $\mathrm{opt}(G, B)$, algorithm $A$ outputs a solution of size

$$O\left(\mathrm{opt}(G, B)\left(1 + \frac{\log \mathrm{opt}(G, B)}{B}\right)\right).$$

Proof Sketch: We bound the VC-dimension of $\mathcal{N}(G)$ in terms of the genus of
$G$ and apply Theorem 4. Details are omitted from this abstract. □



6.2 Sparse Majorities of Weak Hypotheses 

The minimum majority problem is to, given an $m \times n$ matrix $A$ with entries in
$\{-1, 1\}$, choose $x \in \mathbb{Z}_+^n$ to minimize $\sum_{i=1}^n x_i$ subject to $Ax > 0$. In other words,
choose as short a sequence $j_1, \ldots, j_k$ of columns as possible such that for each row
$i$, a majority of $A_{i,j_1}, \ldots, A_{i,j_k}$ are 1. The following is an immediate consequence
of Theorem 2.
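
To make the problem statement concrete, the following sketch checks the constraint $Ax > 0$ and finds an optimum for a tiny instance by exhaustive search. It is only an illustration of the definition, not the randomized algorithm of Theorem 6; the function names and the `max_copies` cap are assumptions.

```python
from itertools import product

def is_majority_solution(A, x):
    """True if, in every row, the chosen columns (with multiplicity) sum to > 0."""
    return all(sum(a * xj for a, xj in zip(row, x)) > 0 for row in A)

def min_majority_bruteforce(A, max_copies):
    """Cheapest feasible x with every entry at most max_copies, or None."""
    n = len(A[0])
    best = None
    for x in product(range(max_copies + 1), repeat=n):
        if is_majority_solution(A, x) and (best is None or sum(x) < sum(best)):
            best = x
    return best

A = [[1, 1, -1],
     [1, -1, 1],
     [-1, 1, 1]]
print(min_majority_bruteforce(A, max_copies=2))
```

In this instance every single column and every pair of columns fails on some row, so all three columns are needed.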

Theorem 6. There is a randomized polynomial time algorithm for the
minimum majority problem that, with probability 1/2, outputs a solution of cost
$O(\mathrm{opt}^2\,\mathrm{Pdim}(A))$.

The minimum majority problem is a restatement of an optimization problem
motivated by learning applications. Many learning problems can be modeled as
that of approximating a $\{0, 1\}$-valued function using examples of its behavior
when applied to randomly drawn elements of its domain [Val84, Hau92]; the
approximation is sometimes called a hypothesis. Boosting [Sch90, Fre95, FS97] is a
method for combining “weak hypotheses”, which are correct on only a slight
majority of the input examples, into a “strong hypothesis”, which outputs a
weighted majority vote of the weak hypotheses. The key idea of the most
influential analysis of the ability of the strong hypothesis to generalize to unseen
domain elements [SFBL98] is to use the fact that it can be approximated by a
majority of a few of the weak hypotheses. This suggests an alternative approach
to the design of a learning algorithm: try directly to find hypotheses that explain
the data well using majorities of as few as possible of a collection of weak
hypotheses. This is captured by the minimum majority problem: the rows
correspond to examples, the columns to weak hypotheses, and an entry indicates
whether a given weak hypothesis is correct on a given example. The goal is to
find a small multiset of weak hypotheses whose majority is correct on all of a 
collection of examples. This direct optimization might provide improved gener- 
alization, but even if not, its output should be easier to interpret, which is an 
important goal for some applications IMDI . 

6.3 Simple B-Matching 

The problem of simple B-matching [Lov75] is to, given a family $\mathcal{S}$ of subsets of
a finite set $X$, find a large $\mathcal{T} \subseteq \mathcal{S}$ such that each element of $X$ is contained in
at most $B$ of the sets in $\mathcal{T}$.

Theorem 7. There is a constant $\kappa$ such that for all integers $B \ge 1$ and $d \ge 2$,
there is a polynomial time algorithm for the simple B-matching problem that,
for any input $\mathcal{S}$ such that $\mathrm{VCdim}(\mathrm{dual}(\mathcal{S})) \le d$, outputs a solution of size
$\Omega((\mathrm{opt}(\mathcal{S})/B)^{1-\kappa d/B})$.

Proof: Consider the variant of the simple B-matching problem in which multiple
copies of sets in $\mathcal{S}$ can be included in the output. This problem can be expressed
as a packing integer program in normal form as follows. For each $S \in \mathcal{S}$, include
a variable $x_S$ indicating the number of copies of $S$ in the output. Then the goal
is to maximize $\sum_{S \in \mathcal{S}} x_S$ subject to the constraints, one for each $x \in X$, that

$$\left(\sum_{S \in \mathcal{S} : x \in S} x_S\right)/B \le 1.$$

Suppose $\mathrm{opt}(\mathcal{S})$ is the optimal value of the objective function for the original
simple B-matching problem. Since the optimal value of the objective function
for the multiple-copy variant is at least $\mathrm{opt}(\mathcal{S})$, and since, once again, scaling
elements of a set of functions by a common constant factor does not affect its
pseudo-dimension, Theorem 3 implies that the value of the solution output by
the algorithm described above is $\Omega\left(B\,(\mathrm{opt}(\mathcal{S})/B)^{1-\kappa d/B}\right)$. Certainly no more than $B$
copies of any set are included, so if we output one copy of all sets for which
$x_S > 0$ we get a solution of size $\Omega\left((\mathrm{opt}(\mathcal{S})/B)^{1-\kappa d/B}\right)$. □
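
The final step of this proof, passing from a solution with repeated sets to one with distinct sets, can be sketched as follows. The dictionary representation of the multiset solution and the function names are assumptions made for illustration.

```python
def respects_capacity(counts, B):
    """True if every ground element lies in at most B chosen copies."""
    load = {}
    for S, copies in counts.items():
        for element in S:
            load[element] = load.get(element, 0) + copies
    return all(total <= B for total in load.values())

def drop_duplicates(counts):
    """Keep one copy of every set that was chosen at least once."""
    return [S for S, copies in counts.items() if copies > 0]

B = 2
counts = {frozenset({1, 2}): 1, frozenset({2, 3}): 1, frozenset({4}): 2}
chosen = drop_duplicates(counts)
```

Because a nonempty set can appear at most $B$ times in a feasible multiset solution, removing duplicates shrinks the solution by a factor of at most $B$.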

7 Concluding Remark 

Other generalizations of the VC-dimension to real and integer-valued functions
have been proposed, and results similar to Lemmas 1 and 2 proved for them
(see [Dud78, Nat89, Vap89, KS94, BCHL92, ABCH97, BLW96, BL98]). It is easy to
see how to prove analogues of Theorems 1, 2 and 3 for any of these. In some
cases, these may provide easier analyses or stronger guarantees.
Abstract. Many real-time embedded systems involve a collection of
independently executing event-driven code blocks, having hard real-time
constraints. Portions of such codes when triggered by external events 
require to be executed within a given deadline from the triggering time. 
The feasibility analysis problem for such a real-time system asks whether 
it is possible to schedule all such blocks of code so that all the associ- 
ated deadlines are met even in the worst case triggering sequence. Each 
such conditional real-time code block can be naturally represented by 
a directed acyclic graph whose vertices correspond to portions of code 
having a straight-line flow of control and are associated with execution 
requirements and deadlines relative to their triggering times, and the 
edges represent conditional branches. Till now, no complexity results 
were known for the feasibility analysis problem in this model, and all ex- 
isting algorithms in the real-time systems literature have an exponential 
complexity. In this paper we show that this problem is NP-hard under 
both dynamic and static priorities in the preemptive uniprocessor case, 
even for a set of only two task graphs. For dynamic-priority feasibility 
analysis we give a pseudo-polynomial time exact algorithm and a fully 
polynomial-time approximation scheme for approximate feasibility test- 
ing. For the special case where all the execution requirements of the 
vertices are identical, we present a polynomial time exact algorithm. For 
static-priority feasibility analysis, we introduce a new sufficient condi- 
tion and give a pseudo-polynomial time algorithm for checking it. This 
algorithm gives tighter results for feasibility analysis compared to those 
known so far. 



1 Introduction 

Over the years there have been several efforts to correctly model real-time sys- 
tems and answer scheduling-theoretic questions arising in these models. All of 
the resulting models are based on an abstract framework in which a real-time 
system is modelled as a collection of independent tasks. Each task generates a 
sequence of jobs, each of which is characterized by a ready-time, an execution
requirement, and a deadline. Hard-real-time systems require that for each job 
generated by a task, an amount of processor time equal to the job’s execution 
requirement be assigned to it between its ready-time and its deadline. The
feasibility analysis of such a hard-real-time task set is concerned with determining
whether it is possible to schedule all the jobs generated by the tasks, such that 
they meet their deadlines under all possible circumstances. 

In the context of most real-time embedded systems, each such real-time task 
is required to model an event-driven block of code, parts of which are triggered 
by external events and require to be executed within a given deadline from the 
triggering time. A natural representation of such a task is a directed acyclic 
graph whose vertices represent portions of code having a straight-line flow of 
control, and the edges represent possible conditional branches. The vertices are 
triggered by external events and have to be executed within their associated 
deadlines. The feasibility analysis of such a set of task graphs answers whether 
it is possible to schedule all the graphs so that all the associated deadlines are 
met even in the worst case triggering sequence. The difficulty of such an analysis 
lies in the fact that what constitutes a worst case triggering sequence for an 
individual graph can not be determined in isolation, due to the presence of the 
conditional branches. To illustrate this, consider the following example taken 
from [2].

while (external event) do
    execute code block B_0 {having execution time e_0 and deadline d_0}
    if (C) then
        execute code block B_1 {execution time e_1, deadline d_1}
    else
        execute code block B_2 {execution time e_2, deadline d_2}
    end if
end while

In the above block of code, if the condition C depends on some external
event, or on the value of a variable which cannot be determined at compile
time, then the worst case branch here would depend on the other blocks of code
executing concurrently with this one. Let $e_1 = 2$, $d_1 = 2$, $e_2 = 4$ and $d_2 = 5$.
If another code block is simultaneously executing with $e = 1$ and $d = 1$ then
the $(e_1, d_1)$ branch corresponds to the worst case, whereas if $e = 2$ and $d = 5$
then the $(e_2, d_2)$ branch corresponds to the worst case. Hence the usual method
followed for the feasibility analysis of hard-real-time systems, of approximating
a piece of code by its worst case behaviour, does not work in the presence of
conditional branches.
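
The numbers in this example can be checked with a short calculation. The sketch below makes simplifying assumptions that are not part of the original example: each branch and the competing block contribute a single job, both jobs are released at the same instant, and block B_0 is ignored. The function name `meets_all_deadlines` is hypothetical.

```python
def meets_all_deadlines(jobs):
    """jobs: list of (execution_time, deadline) pairs, all released at time 0.

    On one preemptive processor such a set can be scheduled exactly when, for
    every deadline D, the work that must finish by D does not exceed D.
    """
    for _, deadline in jobs:
        demand = sum(e for e, d in jobs if d <= deadline)
        if demand > deadline:
            return False
    return True

branch_1 = (2, 2)  # (e_1, d_1)
branch_2 = (4, 5)  # (e_2, d_2)
for other in [(1, 1), (2, 5)]:
    for name, branch in [("B_1", branch_1), ("B_2", branch_2)]:
        print(other, name, meets_all_deadlines([branch, other]))
```

Against the competing job $(1, 1)$ only the first branch misses a deadline, while against $(2, 5)$ only the second branch does, which matches the claim that neither branch is the worst case in isolation.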

In this paper we consider the feasibility analysis of a collection of such code 
blocks with conditional branches and real-time constraints, and present a series 
of results on the complexity of various versions of this problem. 



1.1 The Model 

A task modelling a block of code is represented by a directed acyclic graph
with a unique source and a unique sink vertex. Associated with each vertex $v$
is its execution requirement $e(v)$ (which can be determined at compile time)
and deadline $d(v)$. Whenever the vertex $v$ is triggered, the code corresponding
to it has to be executed (which takes $e(v)$ amount of time) within the next $d(v)$
time units. Each directed edge $(u, v)$ in the graph is associated with a minimum
intertriggering separation $p(u, v)$, denoting the minimum amount of time that
must elapse before the vertex $v$ can be triggered after the triggering of the vertex
$u$. This can be used to model a possible communication delay between $u$ and $v$.

The semantics of the execution of such a task graph state that the source
vertex can be triggered at any time, and once a vertex $u$ is triggered then the
next vertex $v$ can be triggered only if there exists a directed edge $(u, v)$ and
at least $p(u, v)$ amount of time has elapsed since the triggering of $u$. If there
are directed edges $(u, v_1)$ and $(u, v_2)$ from the vertex $u$ then only one among
$v_1$ and $v_2$ can be triggered, after the triggering of $u$. Therefore, a sequence of
vertices $v_1, v_2, \ldots, v_k$ getting triggered at time instants $t_1, t_2, \ldots, t_k$ is legal if
and only if there are directed edges $(v_i, v_{i+1})$ and $t_{i+1} - t_i \ge p(v_i, v_{i+1})$ for
$i = 1, \ldots, k-1$. The real-time constraints require that the code corresponding
to vertex $v_i$ be executed within the time interval $(t_i, t_i + d(v_i)]$. Note that in
general the condition $t_{i+1} \ge t_i + d(v_i)$ may not hold, i.e. a vertex can be triggered
before the deadline of the last triggered vertex has elapsed. A consequence of
this might be that the code corresponding to a vertex $v$ is executed before
that corresponding to a vertex $u$, although there exists a directed edge $(u, v)$.
Since this might not be allowable in most applications, throughout this paper we
assume that $t_{i+1} \ge t_i + d(v_i)$, which is equivalent to requiring that $p(u, v) \ge d(u)$.
Most of the previous work is based on this assumption and in the real-time
systems literature this is referred to as the frame separation property.
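
The legality condition and the frame separation property can be checked mechanically. The sketch below follows the "if and only if" characterization stated above; the dictionary encoding of the separations $p(u, v)$ and the function names are assumptions made for illustration.

```python
def is_legal(sequence, separation):
    """sequence: list of (vertex, trigger_time) pairs in triggering order.
    separation: dict mapping each directed edge (u, v) to p(u, v).
    """
    for (u, t_u), (v, t_v) in zip(sequence, sequence[1:]):
        if (u, v) not in separation:
            return False  # consecutive vertices must be joined by an edge
        if t_v - t_u < separation[(u, v)]:
            return False  # triggered before the minimum separation elapsed
    return True

def has_frame_separation(separation, deadline):
    """True if p(u, v) >= d(u) holds on every edge."""
    return all(p >= deadline[u] for (u, _), p in separation.items())

separation = {("v0", "v1"): 2, ("v0", "v2"): 5}
deadline = {"v0": 2, "v1": 2, "v2": 5}
print(is_legal([("v0", 0), ("v1", 2)], separation))
print(has_frame_separation(separation, deadline))
```

A sequence that triggers v1 only one time unit after v0, or that moves from v1 to v2 without an edge, is rejected by the first function.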

Task sets and feasibility analysis. A task set T = {Ti, T2, . . . , Tj,} consists of 
a collection of task graphs, the vertices of which can get triggered independently 
of each other. A triggering sequence for such a task set T is legal if and only if 
for every task graph Ti, the subset of vertices of the sequence belonging to Ti 
constitutes a legal triggering sequence for Ti. In other words, a legal triggering 
sequence for T is obtained by merging together (ordered by triggering times, 
with ties broken arbitrarily) legal triggering sequences of the constituting tasks. 

The feasibility analysis of a task set T is concerned with determining whether 
for all possible legal triggering sequences of T, the codes corresponding to the 
vertices of the task graphs can be scheduled such that all their associated dead- 
lines are met. In this paper we consider the preemptive uniprocessor version of 
this problem. 

Many scheduling algorithms are implemented by assigning priorities at each 
time instant (according to some criteria), to all jobs that are ready to execute and 
then allocating the processor to the highest priority job. Based on this, scheduling 
algorithms can be broadly classified into either dynamic-priority or static-priority 
(also known as fixed-priority) algorithms. Dynamic-priority algorithms allow the 
switching of priorities between tasks. This means that for two tasks, both having 
ready jobs at two time instants, at one instant the first task’s job might have a 
higher priority than the second task’s job, while at the other instant the priorities 
might switch. Static-priority algorithms, in contrast to this, do not allow such 
priority switching. Here we will be concerned with both dynamic- and static-
priority algorithms.

1.2 Our Results 

The task model considered in this paper, apart from being of independent inter- 
est, forms the core of the recurring real-time task model very recently proposed 
by Baruah in m- This model is especially suited for accurately modelling con- 
ditional real-time code with recurring behaviour, i.e. where code blocks run in 
an infinite loop, and generalizes many of the previous well known models like the 
sporadic |H| , multiframe j0| , generalized multiframe 0| , and recurring branching 
0. All of these previous models can be shown 0 to be special cases of the 
recurring real-time task model. However, the algorithms presented in 0 for the 
feasibility analysis problem in this model for the preemptive uniprocessor case, 
both with dynamic and static priorities, have a running time which is exponential
in the number of vertices of the task graphs. It was also remarked that the
feasibility analysis problem for this model is ‘likely to be intractable’ and, in
contrast to the previous (less general) models, can no longer be solved in
pseudo-polynomial time.

The main contribution of this paper is that it answers all the questions raised 
in |0] and thereby settles the complexity of the feasibility analysis problem for 
scheduling conditional real-time code. For the ease of presentation, the model we 
consider here is slightly simpler than that of 0 in the sense that we do not con- 
sider the recurring behaviour of the task graphs. We postpone the details of how 
the results derived here for this simpler model can be extended to the recurring 
real-time task model, to a full version of this paper. Firstly, we show that the 
feasibility analysis problem, both for dynamic and static priorities, is NP-hard. 
For the dynamic-priority feasibility analysis we give a pseudo-polynomial time 
exact algorithm and a fully polynomial-time approximation scheme for approx- 
imate feasibility testing. We also show that for the special case where all the 
vertices of a task graph have equal execution requirements, this problem can be 
solved in polynomial time. 

For static-priority feasibility analysis Baruah had introduced a sufficient con- 
dition in 0 . We give a tighter condition for sufficiency and show that this can be 
checked in pseudo-polynomial time. Further, our condition is simpler than that 
of 0. Our results imply that for all practical purposes the feasibility analysis 
problem in the recurring real-time task model is efficiently solvable. 

We present the hardness results in Section 2. In Section 3 we present the
algorithms for dynamic-priority feasibility analysis, followed by those for
static-priority feasibility analysis in Section 4. Due to space restrictions all proofs are
omitted here; we refer the interested reader to 0. 

2 NP-Hardness of Feasibility Analysis 

In this section we show that both the dynamic- and static-priority feasibility
analysis problems for our task model, and therefore for the recurring real-time
task model as well, are NP-hard for the preemptive uniprocessor case. Our proofs
rely on a reduction from the knapsack problem, which is known to be NP-hard.

Theorem 1. The dynamic-priority feasibility analysis problem for the task 
model described in Section o in a preemptive uniprocessor environment is NP- 
hard. 

The next result shows that the static-priority feasibility analysis problem is 
NP-hard. The pseudo-polynomial time algorithm that we present later for this 
problem, and also the algorithm presented in [^, are based on testing whether 
a given task from a task set is lowest-priority feasible. A task $T \in \mathcal{T}$ is
lowest-priority feasible if and only if all the vertices of T can always meet their
deadlines with T assigned the lowest priority and all the remaining tasks of
$\mathcal{T}$ having any arbitrary priority assignment. The existence of a lowest-priority
feasible task in any static-priority feasible task set is given later by Theorem 6.

The next theorem says that the lowest-priority feasibility testing problem 
is NP-hard, and as a corollary of this it follows that static-priority feasibility 
analysis is also NP-hard. 

Theorem 2. The problem of determining whether a given task is lowest-priority 
feasible is NP-hard. 



Corollary 1. The static-priority feasibility analysis problem is NP-hard. 

3 Dynamic-Priority Feasibility Analysis 

A necessary and sufficient condition for the dynamic-priority feasibility of the 
recurring real-time task model was stated in It was stated without proof 
that the condition follows from the processor demand criterion introduced in 
0. It is possible to give a simple independent proof (7] showing that the same 
condition works for our model. It is based on an abstraction of a task, represented 
by a function called the demand-bound function. The demand-bound function 
of a task T, denoted by T.dbf(t), takes as an argument a real number t and 
returns the maximum possible cumulative execution requirement by vertices 
of T that have been triggered by a legal triggering sequence and have both 
their ready times and deadlines within a time interval of length t. Intuitively,
T.dbf(t) denotes the maximum execution requirement that can be demanded by T
within any time interval of length t, if all its vertices are to meet their
deadlines.

Theorem 3. A task set $\mathcal{T}$ is dynamic-priority feasible if and only if for all
t > 0, $\sum_{T \in \mathcal{T}} T.dbf(t) \le t$.

We next show that the problem of computing T.dbf(t) for a task T is NP-hard
and then give an FPTAS for approximating it, which immediately leads to an
approximate decision algorithm for the feasibility analysis problem.

Theorem 4. The problem of computing T.dbf(t) is NP-hard.



Given a task graph T we first give a pseudo-polynomial time algorithm for
computing T.dbf(t) for any t > 0, based on dynamic programming. Let there be n
vertices in T denoted by $v_1, \ldots, v_n$, and without loss of generality we
assume that there can be a directed edge from $v_i$ to $v_j$ only if $i < j$.
Following the notation of Section 1, associated with each vertex $v_i$ is its
execution requirement $e(v_i)$, which here is assumed to be integral (a
pseudo-polynomial algorithm is meaningful only under this assumption), and its
deadline $d(v_i)$. Associated with each edge $(v_i, v_j)$ is the minimum
intertriggering separation $p(v_i, v_j)$.

Let $t_{i,e}$ be the minimum time interval within which the task T can have
an execution requirement of exactly e time units due to some legal triggering
sequence, considering only a subset of vertices from the set $\{v_1, \ldots, v_i\}$, if
all the triggered vertices are to meet their respective deadlines. Let $t^{i}_{i,e}$ be
the minimum time interval within which a sequence of vertices from the set
$\{v_1, \ldots, v_i\}$, and ending with the vertex $v_i$, can have an execution requirement
of exactly e time units, if all the vertices have to meet their respective deadlines.
Lastly, let $E = \max_{i=1,\ldots,n} e(v_i)$. Clearly, nE is an upper bound on T.dbf(t)
for any t > 0. It can be shown by induction that Algorithm 1 correctly
computes T.dbf(t), and has a running time of $O(n^3 E)$.



Algorithm 1 Computing T.dbf(t)

Input: Task graph T, and a real number t > 0
for e ← 1 to nE do
    $t^{1}_{1,e} \leftarrow d(v_1)$ if $e(v_1) = e$, and $\infty$ otherwise
    $t_{1,e} \leftarrow t^{1}_{1,e}$
end for
for i ← 1 to n − 1 do
    for e ← 1 to nE do
        Let there be directed edges from the vertices $v_{i_1}, v_{i_2}, \ldots, v_{i_k}$ to $v_{i+1}$
        $t^{i+1}_{i+1,e} \leftarrow \min\{ t^{i_j}_{i_j,\,e-e(v_{i+1})} - d(v_{i_j}) + p(v_{i_j}, v_{i+1}) + d(v_{i+1}) \mid j = 1, \ldots, k \}$
            if $e(v_{i+1}) < e$, $d(v_{i+1})$ if $e(v_{i+1}) = e$, and $\infty$ otherwise
        $t_{i+1,e} \leftarrow \min\{ t_{i,e},\ t^{i+1}_{i+1,e} \}$
    end for
end for
T.dbf(t) ← $\max\{ e \mid t_{n,e} \le t \}$
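
The dynamic program of Algorithm 1 can be sketched in Python as follows. This is
an illustrative sketch under stated assumptions, not the authors' code: vertices
are 0-indexed in topological order, `preds[i]` lists the predecessors of vertex
i, `p[(j, i)]` is the separation on edge (j, i), and all function and variable
names are hypothetical. The three-vertex chain at the end is example data chosen
only to exercise the code.

```python
import math

def min_interval_table(e, d, preds, p):
    """Return best[x]: the smallest interval length in which the task can
    demand exactly x units of execution (math.inf if x is unattainable)."""
    n = len(e)
    total = n * max(e)  # nE, an upper bound on any attainable demand
    # end_at[i][x]: smallest interval for a vertex sequence ending at vertex i
    end_at = [[math.inf] * (total + 1) for _ in range(n)]
    best = [math.inf] * (total + 1)
    for i in range(n):
        for x in range(1, total + 1):
            if e[i] == x:
                end_at[i][x] = d[i]
            elif e[i] < x:
                for j in preds[i]:
                    # trigger time of j, plus separation, plus deadline of i
                    cand = end_at[j][x - e[i]] - d[j] + p[(j, i)] + d[i]
                    end_at[i][x] = min(end_at[i][x], cand)
            best[x] = min(best[x], end_at[i][x])
    return best

def dbf(best, t):
    """Largest demand x whose minimum interval fits within length t."""
    return max((x for x in range(1, len(best)) if best[x] <= t), default=0)

# Example data (assumed): a chain v0 -> v1 -> v2.
e = [1, 1, 3]
d = [2, 2, 6]
preds = [[], [0], [1]]
p = {(0, 1): 2, (1, 2): 2}
best = min_interval_table(e, d, preds, p)
```

The table `best` is computed once per task graph; each later evaluation of
`dbf(best, t)` is then a scan over at most nE entries.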



Given this algorithm, any t > 0, and an $0 < \varepsilon < 1$, let $T_t$ be the subgraph
of T consisting only of those vertices $v_i$ for which $d(v_i) \le t$, and let $E_t$ denote
the maximum execution requirement of a vertex from among all vertices of $T_t$.
Now we scale all the execution requirements associated with the vertices of $T_t$ by
$K = \varepsilon E_t / n$, i.e. $e'(v_i) = \lfloor e(v_i)/K \rfloor$, and run the algorithm with the new
$e'(v_i)$'s and the graph $T_t$. By using the same arguments as in the FPTAS for the
knapsack problem, it is possible to show that for any t > 0, the algorithm outputs
a value $\ge (1 - \varepsilon)\, T.dbf(t)$ and runs in time $O(n^4/\varepsilon)$, and is therefore an
FPTAS for computing T.dbf(t). We denote the result computed by this algorithm
by T.dbf'(t).

For our approximate decision algorithm, note that for all t > 0, there can be
at most n distinct values of $E_t$ for any task graph. For each such $E_t$, we consider
the corresponding subgraph that gives rise to this $E_t$ as described above, and
scale the execution requirements of the vertices of this subgraph by $K = \varepsilon E_t / n$.
In each such subgraph $T_t$, the number of values of time intervals t' at which the
value of $T_t.dbf'(t')$ changes is bounded by $O(n^2/\varepsilon)$, and hence the number of
values of time intervals t at which the value of $\sum_{T \in \mathcal{T}} T.dbf'(t)$ changes is
bounded by $O(|\mathcal{T}| n^3/\varepsilon)$. Our fully polynomial time approximate decision
algorithm for dynamic-priority feasibility analysis is now given as Algorithm 2.
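
The scaling step used by the FPTAS can be sketched as follows. This is a minimal
illustration, assuming exact rational arithmetic (Python's `Fraction`) to avoid
rounding issues at the floor boundary; the function name and the sample numbers
are hypothetical, and `n` is passed explicitly as the number of vertices of the
task graph.

```python
from fractions import Fraction
import math

def scale_requirements(e_sub, n, eps):
    """Scale the execution requirements of the vertices of a subgraph.

    e_sub: integral execution requirements of the subgraph's vertices
    n:     number of vertices of the task graph
    eps:   accuracy parameter, 0 < eps < 1
    Returns (K, scaled) with K = eps * E_t / n and scaled[i] = floor(e_sub[i] / K).
    """
    K = Fraction(eps) * max(e_sub) / n
    scaled = [math.floor(x / K) for x in e_sub]
    return K, scaled

# Example data (assumed): three vertices, eps = 1/2.
K, scaled = scale_requirements([10, 20, 40], 3, Fraction(1, 2))
```

After scaling, the largest requirement is at most n/eps, so the table built by
the dynamic program has a number of columns polynomial in n and 1/eps. Each
vertex loses less than K units to rounding, so any vertex sequence loses less
than nK = eps * E_t in total.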

Algorithm 2 Approximate decision algorithm for feasibility analysis
Input: Task set $\mathcal{T}$ and a real $0 < \varepsilon < 1$
decision ← YES
for all values of t at which T.dbf'(t) changes for any $T \in \mathcal{T}$ do
    if $\sum_{T \in \mathcal{T}} T.dbf'(t) / (1 - \varepsilon) > t$ then {Condition (*)}
        decision ← NO
    end if
end for
return decision
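
The loop of Algorithm 2 can be sketched as below. The sketch assumes that each
task's approximate demand-bound function has already been tabulated as a sorted
list of `(t, value)` breakpoints, and that Condition (*) compares the summed
approximate demand divided by (1 − ε) against t; this reading of the condition
is an assumption inferred from the statement of Theorem 5. All names and the
sample tables are hypothetical.

```python
import bisect

def step_value(table, t):
    """Value of a nondecreasing step function given as sorted (time, value)
    breakpoints: the value at the largest breakpoint time <= t, else 0."""
    times = [time for time, _ in table]
    k = bisect.bisect_right(times, t)
    return table[k - 1][1] if k else 0

def approx_decision(tables, eps):
    """Return "NO" if the scaled-up approximate demand exceeds t at some
    breakpoint, and "YES" otherwise."""
    change_points = sorted({time for table in tables for time, _ in table})
    for t in change_points:
        total = sum(step_value(table, t) for table in tables)
        if total / (1 - eps) > t:  # assumed form of Condition (*)
            return "NO"
    return "YES"

# Example data (assumed): two tasks with hand-written breakpoint tables.
tables = [[(2, 1), (4, 2), (6, 3)], [(4, 1)]]
```

Only breakpoints need to be checked because the summed demand is constant
between consecutive breakpoints while t grows.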

Theorem 5. If a task set $\mathcal{T}$ is infeasible then Algorithm 2 always returns the
correct answer. If $\mathcal{T}$ is feasible and $t \ge \sum_{T \in \mathcal{T}} T.dbf(t) / (1 - \varepsilon)$ for all values of
t, then the algorithm always returns the correct answer YES, otherwise it might
return a NO. YES answers are always correct. The running time of the algorithm
is $O(|\mathcal{T}|^2 n^5 \varepsilon^{-2} \log n)$, if all task graphs have O(n) vertices.

For each task T, computing the $t_{n,e}$ values for each of its subgraphs $T_t$, using
Algorithm 1 and the scaled execution requirements, requires $O(n^5/\varepsilon)$ time, and
these values are stored in a table. Hence computing all such values for all the task
graphs in $\mathcal{T}$ takes $O(n^5 |\mathcal{T}|/\varepsilon)$ time. For each value of t for which
$\sum_{T \in \mathcal{T}} T.dbf'(t)$ changes, computing T.dbf'(t) for any $T \in \mathcal{T}$ requires a binary
search to identify the appropriate table corresponding to a subgraph $T_t$, and then
a linear search through the table. Therefore, computing the value of
$\sum_{T \in \mathcal{T}} T.dbf'(t)$ for any value of t takes $O(|\mathcal{T}| n^2 \varepsilon^{-1} \log n)$ time. Hence the
total running time of Algorithm 2 is $O(|\mathcal{T}|^2 n^5 \varepsilon^{-2} \log n)$. The algorithm is overly
pessimistic in the sense that for certain feasible task sets it might return a NO.
However, for task sets which can be in some sense comfortably scheduled even in
the worst case, leaving some idle processor time (which can be parameterized by
$\varepsilon$), the algorithm always returns a YES. Therefore, any $\varepsilon$ characterizes a class of
task sets for which the algorithm errs. Decreasing $\varepsilon$ reduces this class of task
sets, at the cost of increasing the running time quadratically in $1/\varepsilon$, thereby
giving a fully polynomial-time approximate decision scheme for approximate
feasibility testing.

It may be noted that changing Condition (*) in Algorithm 2 to

    if $\sum_{T \in \mathcal{T}} T.dbf'(t) > t$ then
        decision ← NO
    end if

will result in an overly optimistic algorithm which might incorrectly return a YES
for certain classes of infeasible task sets. For all feasible task sets it always returns
YES, and NO answers are always correct. The task sets for which the algorithm
might err are those in which the cumulative execution requirement by tasks of $\mathcal{T}$
within any time interval of length t exceeds the maximum execution requirement
that can be feasibly scheduled, by an amount of less than $\varepsilon$ time units. Again,
decreasing $\varepsilon$ reduces the class of such task sets, at the cost of the running time
increasing linearly in $1/\varepsilon$.

Lastly, it might be noted that Theorem 3 along with Algorithm 1 also imply
a pseudo-polynomial time algorithm for dynamic-priority feasibility analysis.
To see this, for any task $T \in \mathcal{T}$ let $t^{T}_{\max}$ denote the maximum amount of
time elapsed among all execution sequences starting from the source vertex of
T and ending at the sink vertex, if every vertex is triggered at the earliest
possible time (respecting the minimum intertriggering separations). Let
$t_{\max} = \max_{T \in \mathcal{T}} t^{T}_{\max}$. It follows from Theorem 3 that $\mathcal{T}$ is dynamic-priority
feasible if and only if $\sum_{T \in \mathcal{T}} T.dbf(t) \le t$ for all $t = 1, \ldots, t_{\max}$. T.dbf(t) for any t
can be determined in pseudo-polynomial time by Algorithm 1 and clearly, $t_{\max}$
is pseudo-polynomially bounded, implying a pseudo-polynomial algorithm for
dynamic-priority feasibility analysis.

3.1 Vertices with Equal Execution Requirements 

We now show that for the special case where for every task T belonging to a
task set $\mathcal{T}$, all the vertices of T have equal execution requirements, the feasibility
analysis problem for $\mathcal{T}$ can be solved in polynomial time. This result holds even
when all execution requirements and deadlines take values over the reals.

We denote the vertices of a task graph T by $v_1, \ldots, v_n$ and assume that there
can be a directed edge from $v_i$ to $v_j$ only if $i < j$. Let $t_{i,k}$ denote the minimum
time interval within which exactly k vertices of T from the set $\{v_1, \ldots, v_i\}$
(obviously $k \le i$) need to be executed as a result of some legal triggering sequence,
if they have to meet their associated deadlines. Let $t^{i}_{i,k}$ denote the minimum
time interval within which exactly k vertices of T consisting of $v_i$ and any other
k − 1 vertices from $\{v_1, \ldots, v_{i-1}\}$ need to be executed as a result of some legal
triggering sequence, if they have to meet their associated deadlines.

Given any vertex $v_i$ of T, let there be directed edges from the vertices
$v_{i_1}, \ldots, v_{i_l}$ to $v_i$. Then for any $k \le i$,

$t^{i}_{i,k} = \min\{ t^{i_j}_{i_j,k-1} - d(v_{i_j}) + p(v_{i_j}, v_i) + d(v_i) \mid j = 1, \ldots, l \}$ (and $d(v_i)$ if k = 1)

Using the fact that $t_{1,1} = t^{1}_{1,1} = d(v_1)$, it is now possible to compute any $t_{i,k}$
within at most $O(n^3)$ time, where n is the number of vertices in the task graph.
Now, if each task graph $T \in \mathcal{T}$ has $n_T$ vertices then let us consider the set
$S = \bigcup_{T \in \mathcal{T}} \bigcup_{i=1}^{n_T} \{ t_{n_T,i} \text{ of task graph } T \}$. If each vertex of task graph T has
an execution requirement of e, then for any t > 0, $T.dbf(t) = \max\{ ie \mid t_{n_T,i} \le t \}$.

Clearly, the task set $\mathcal{T}$ is feasible if and only if $\sum_{T \in \mathcal{T}} T.dbf(t) \le t$ for all $t \in S$.
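
The equal-execution case can be sketched in Python as follows, as an
illustration under stated assumptions rather than the authors' implementation.
Vertices are 0-indexed in topological order, `preds[i]` lists predecessors,
`p[(j, i)]` is the edge separation, and each task is passed to `feasible` as a
pair of its common execution requirement and its table of minimum intervals.
All names and sample data are hypothetical.

```python
import math

def min_intervals_by_count(d, preds, p):
    """best[k]: smallest interval in which exactly k vertices must execute."""
    n = len(d)
    # end_at[i][k]: smallest interval for k vertices, the last one being i
    end_at = [[math.inf] * (n + 1) for _ in range(n)]
    best = [math.inf] * (n + 1)
    for i in range(n):
        end_at[i][1] = d[i]
        for k in range(2, i + 2):
            for j in preds[i]:
                cand = end_at[j][k - 1] - d[j] + p[(j, i)] + d[i]
                end_at[i][k] = min(end_at[i][k], cand)
        for k in range(1, n + 1):
            best[k] = min(best[k], end_at[i][k])
    return best

def dbf_equal(e, best, t):
    """Demand bound when every vertex needs e units: e times the largest
    vertex count whose minimum interval fits in length t."""
    return e * max((k for k in range(1, len(best)) if best[k] <= t), default=0)

def feasible(tasks):
    """tasks: list of (e, best). Check the demand criterion at every finite
    table entry, i.e. at every point of the set S."""
    points = sorted({b for _, best in tasks for b in best[1:] if b < math.inf})
    return all(sum(dbf_equal(e, best, t) for e, best in tasks) <= t
               for t in points)

# Example data (assumed): a three-vertex chain and a single-vertex task.
chain = min_intervals_by_count([2, 2, 6], [[], [0], [1]], {(0, 1): 2, (1, 2): 2})
single = min_intervals_by_count([4], [[]], {})
```

Because the tables are indexed by vertex counts rather than by execution
amounts, their size does not depend on the magnitude of the execution
requirements, which is what makes this case polynomial.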






S. Chakraborty, T. Erlebach, and L. Thiele 



Computing all the necessary dbf values for each task graph and storing them
in a table takes $O(n^3)$ time if the number of vertices in any task graph is O(n).
Since there are $|\mathcal{T}|$ task graphs, this whole process takes $O(|\mathcal{T}| n^3)$ time. For
each value of t, verifying whether the sum of the dbfs exceeds t requires a search
through the previously computed tables and takes $O(|\mathcal{T}| \log n)$ time. Since there
are $O(|\mathcal{T}| n)$ values of t for which this has to be verified, this takes $O(|\mathcal{T}|^2 n \log n)$
time. Hence the total run time is bounded by $O(|\mathcal{T}| n^3 + |\mathcal{T}|^2 n \log n)$.

4 Static-Priority Feasibility Analysis 

The static-priority feasibility analysis of a task set $\mathcal{T}$ is concerned with
determining whether there exists an assignment of priorities to the tasks of $\mathcal{T}$ under
which they can be scheduled by a static-priority run time scheduler so that all
deadlines are met even in the worst case triggering sequence. Any such priority
assignment is defined to be a good static-priority assignment for $\mathcal{T}$. As mentioned
in Section 2, solving this feasibility analysis problem is based on testing whether
a given task $T \in \mathcal{T}$ is lowest-priority feasible. Clearly, if there is a procedure for
testing lowest-priority feasibility, and the task set $\mathcal{T}$ is static-priority feasible,
then $|\mathcal{T}|$ calls to this procedure will be sufficient to identify a lowest-priority
feasible task of $\mathcal{T}$. Therefore, if $|\mathcal{T}| = n$ then with $O(n^2)$ calls to this procedure a
good static-priority assignment for $\mathcal{T}$ can be determined based on the following
theorem.

Theorem 6 (Audsley, Tindell, Burns pj). Suppose a task T G T is lowest- 
priority feasible. Then there is a good static-priority assignment for $\mathcal{T}$ if and
only if there is a good static-priority assignment for $\mathcal{T} \setminus \{T\}$.
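
The way Theorem 6 turns a lowest-priority feasibility test into a priority
assignment can be sketched as follows. The test itself is passed in as a
callable `lowest_priority_feasible(task, others)`; its name, its signature, and
the toy oracle used in the example are hypothetical and stand in for whichever
test is actually available.

```python
def assign_priorities(tasks, lowest_priority_feasible):
    """Return the tasks ordered from highest to lowest priority, or None if
    no remaining task passes the lowest-priority test at some stage."""
    remaining = list(tasks)
    lowest_first = []
    while remaining:
        for idx, cand in enumerate(remaining):
            others = remaining[:idx] + remaining[idx + 1:]
            if lowest_priority_feasible(cand, others):
                lowest_first.append(cand)  # cand gets the lowest free priority
                remaining = others
                break
        else:
            return None  # nobody can take the lowest remaining priority
    return lowest_first[::-1]

# Toy oracle (assumed, NOT a real schedulability test): a task (name, cost,
# deadline) passes if its own cost plus the cost of all others fits its deadline.
def toy_oracle(task, others):
    return task[1] + sum(o[1] for o in others) <= task[2]

tasks = [("a", 1, 2), ("b", 1, 4), ("c", 2, 8)]
order = assign_priorities(tasks, toy_oracle)
```

With n tasks the outer loop runs n times and each round makes at most n oracle
calls, matching the $O(n^2)$ bound on calls mentioned above. If the oracle
implements only a sufficient condition, a `None` result does not prove that no
good assignment exists.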

An algorithm for static-priority feasibility analysis therefore reduces to devising 
an algorithm for lowest-priority feasibility testing. An algorithm implementing a 
sufficient condition for lowest-priority feasibility was given by Baruah in |5j for 
the recurring real-time task model. It is also based on an abstraction of a task,
similar to the demand-bound function presented in Section 3, and uses a function
called the request-bound function. The request-bound function of a task T,
denoted by T.rbf(t), takes as an argument a real number t and returns the
maximum possible cumulative execution requirement by vertices of T that have been
triggered according to some legal triggering sequence and have their ready times
within any time interval of length t. Intuitively, T.rbf(t) is an upper bound on
the maximum amount of time, within any time interval of length t, for which T
can deny the processor to all lower-priority tasks. Based on this function, the
following sufficiency condition was given for lowest-priority feasibility testing in [2].

Theorem 7 (Baruah [2]). A task $T \in \mathcal{T}$ is lowest-priority feasible if $\forall t : \exists t' \le t$
such that $t' - \sum_{T' \in \mathcal{T} \setminus \{T\}} T'.rbf(t') \ge T.dbf(t)$.

For any task $T \in \mathcal{T}$, in our task model described in Section 1, let $t^{T}_{\max}$ be
as described in Section 3. Clearly, T is lowest-priority feasible if the condition
given by Theorem 7 is satisfied for all values of $t = 1, \ldots, t^{T}_{\max}$. Although $t^{T}_{\max}$
is pseudo-polynomially bounded by the representation of T, the algorithm for
computing T.rbf(t) for any t and $T \in \mathcal{T}$ as given in [2] runs in time which is
exponential in the number of vertices of T.

We first show that the problem of computing T.rbf(t) is NP-hard and then
give a modified request-bound function, which we denote by T.rbf'(t), and give a
pseudo-polynomial time algorithm for computing it for any value of t > 0 based
on dynamic programming. Using rbf'(t) we then give a new sufficiency condition
for testing lowest-priority feasibility. This gives a tighter test compared to that of
Theorem 7 in the following sense: for any task set $\mathcal{T}$, if a task $T \in \mathcal{T}$ is returned
as lowest-priority feasible by the test in Theorem 7 then it is also returned as
lowest-priority feasible by our test, and there exist task sets $\mathcal{T}$ and tasks $T \in \mathcal{T}$
which, although being lowest-priority feasible, fail the test in Theorem 7 but are
returned as lowest-priority feasible by our test. Lastly, we show that for any task
set consisting of exactly two tasks, our test is both a necessary and sufficient
condition.

Theorem 8. The problem of computing T.rbf(t) is NP-hard. 

Our new T.rbf'(t) is similar to T.rbf(t) and returns the maximum possible
cumulative execution requirement by vertices of T within any time interval of
length t, that have been triggered by a legal triggering sequence. To illustrate
the difference between the two functions, consider a task graph T consisting of a
single vertex having an execution requirement of 5 and any arbitrary deadline.
Whereas T.rbf(t) = 5 for any t > 0 (since the ready time of T is at time 0),
T.rbf'(t) = t for t < 5 and is equal to 5 for any $t \ge 5$.

Following the notation used in Section 3, given a task graph T, let $t_{i,e}$ denote
the minimum time interval within which T can have an execution requirement
of exactly e time units due to some legal triggering sequence, considering only
a subset of vertices from the set $\{v_1, \ldots, v_i\}$. Let $t^{i}_{i,e}$ be the minimum time
interval within which any execution sequence consisting of vertices from the set
$\{v_1, \ldots, v_{i-1}\}$ and ending with the vertex $v_i$ can have an execution requirement
of exactly e time units. Now recall the definition of $t^{i}_{i,e}$ as used in Section 3
for computing T.dbf(t), which is the minimum time interval within which a
sequence of vertices from the set $\{v_1, \ldots, v_i\}$, and ending with the vertex $v_i$,
can have an execution requirement of exactly e time units, if all the vertices
have to meet their respective deadlines. This we denote here by $dbf^{i}(e)$. We
assume, as in Section 3, that T consists of n vertices $v_1, \ldots, v_n$ and that there
can be a directed edge from $v_i$ to $v_j$ only if $i < j$, and that all the execution
requirements are integral. If $E = \max_{i=1,\ldots,n} e(v_i)$, then Algorithm 3 correctly
computes T.rbf'(t) and has a running time of $O(n^3 E^2)$. Our new sufficiency
condition for lowest-priority feasibility is based on the following lemma.

Lemma 1. Let $T \in \mathcal{T}$ and the task graph corresponding to T have n vertices
$v_1, \ldots, v_n$. If each of these vertices $v_i$ is lowest-priority feasible in the task set
$(\mathcal{T} \setminus \{T\}) \cup \{v_i\}$, then T is also lowest-priority feasible.






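Algorithm 3 Computing T.rbf'(t)

Input: Task graph T, and a real number t > 0
for e ← 1 to nE do
    $t^{1}_{1,e} \leftarrow e$ if $e \le e(v_1)$, and $\infty$ if $e > e(v_1)$
    $t_{1,e} \leftarrow t^{1}_{1,e}$
end for
Computing $t^{i+1}_{i+1,e}$:
    Let there be directed edges from the vertices $v_{i_1}, \ldots, v_{i_k}$ to $v_{i+1}$
    Let $t^{i_j}_{i+1,e}(l) \leftarrow dbf^{i_j}(e - e(v_{i+1}) + l) - d(v_{i_j}) + p(v_{i_j}, v_{i+1}) + e(v_{i+1}) - l$
    Let $t^{i+1}_{i+1,e} \leftarrow \min\{ t^{i_j}_{i+1,e}(l) \mid l = 0, \ldots, e(v_{i+1}) - 1 \}$
$t_{i+1,e} \leftarrow \min\{ t_{i,e},\ t^{i+1}_{i+1,e} \}$
T.rbf'(t) ← $\max\{ e \mid t_{n,e} \le t \}$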



Theorem 9. A task $T \in \mathcal{T}$ is lowest-priority feasible if for all vertices v
belonging to the task graph of T, $\exists\, 0 \le t \le d(v)$ for which
$t - \sum_{T' \in \mathcal{T} \setminus \{T\}} T'.rbf'(t) \ge e(v)$.

It is easy to see that if a task $T \in \mathcal{T}$ is returned as lowest-priority feasible by the
test given by Theorem 7 then it also passes the test of Theorem 9. Additionally,
if T is returned as lowest-priority feasible, then it is really so. To show that this
represents a tighter test, consider a task set consisting of two task graphs $T_1$ and
$T_2$. $T_1$ is a simple chain of three vertices with the first two vertices having their
execution requirements equal to 1 and deadlines equal to 2, and the third vertex
having an execution requirement of 3 and deadline equal to 6. The intertriggering
separation on any directed edge (u, v) is equal to the deadline of u. $T_2$ consists
of a single vertex having an execution requirement of 1 and deadline equal to 4.
It can be seen that $T_2$ is indeed lowest-priority feasible and passes the test of
Theorem 9 but fails the test given by Theorem 7. Lastly, we show that for any
set of exactly two task graphs, the test given by Theorem 9 is both a necessary
and sufficient condition.

Theorem 10. For any task set $\mathcal{T}$ consisting of exactly two task graphs, a task
$T \in \mathcal{T}$ is lowest-priority feasible if and only if it satisfies the test given by
Theorem 9.

It now follows from Theorem 6, Algorithm 3 and Theorem 9 that there exists
a pseudo-polynomial algorithm for static-priority feasibility analysis that
implements the sufficiency condition stated by Theorem 9. Further, the same approach
of scaling the execution requirements associated with the vertices as described
in Section 3 for the dynamic-priority feasibility analysis, and then using
Algorithm 3 with the scaled values, will give a polynomial time approximate decision
algorithm for static-priority feasibility analysis. Lastly, in the case where for each
task graph all the vertices have equal execution times, this problem can also be
solved in polynomial time. We skip the details of any of these in this paper.

5 Concluding Remarks

This paper settles the complexity of the feasibility analysis problem involved 
in scheduling a collection of code blocks with conditional branches and real- 
time constraints. In particular it shows that although the feasibility analysis of 
the recently introduced recurring real-time task model is NP-hard, there exists a 
pseudo-polynomial time exact algorithm and a fully polynomial-time approxima- 
tion scheme for solving it. All the results presented here pertain to the preemptive 
uniprocessor version of this problem. It would be natural to extend these results 
to the non-preemptive and different multiprocessor cases. Following [3], our
algorithms were based on an abstraction of a task represented by the demand-bound
and the request-bound functions, which in some sense captured the worst case 
behaviour of a task. It seems unlikely that this same approach might work for 
any non-preemptive or multiprocessor case, except for very restricted classes of 
tasks such as those where all vertices have unit execution requirements and time 
is integral. In more general cases, the worst case triggering sequence of the ver- 
tices of a task graph as identified by the demand- or request-bound functions 
need not be the worst case when the issue of feasibly packing these jobs (on
multiple processors, for example) is taken into account.

Abstract. We develop external data structures for storing points in one
or two dimensions, each moving along a linear trajectory, so that a range 
query at a given time tq can be answered efficiently. The novel feature 
of our data structures is that the number of I/Os required to answer a 
query depends not only on the size of the data set and on the number 
of points in the answer but also on the difference between tq and the 
current time; queries close to the current time are answered fast, while 
queries that are far away in the future or in the past may take more time. 



1 Introduction 

I/O-communication, and not internal memory computation time, is often the
bottleneck in a computation when working with datasets larger than the
available main memory. Recently, external geometric data structures have received
considerable attention because massive geometric datasets arise naturally in
many applications (see fTTTlr] and references therein). The need for storing and 
processing continuously moving data arises in a wide range of applications,
including air-traffic control, digital battlefields, and mobile communication
systems. Most of the existing database systems, which assume that the data is
constant unless it is explicitly modified, are not suitable for representing,
storing, and querying continuously moving objects because either the database has
to be continuously updated or a query output will be obsolete. A better
approach would be to represent the position of a moving object as a function f(t)
of time, so that the position changes without any explicit change in the database
system and so that the database needs to be updated only when the function
f(t) changes (e.g., when the velocity of the object changes).

In this paper, we focus on developing efficient external data structures for
storing a set of moving points in one- or two-dimensional space so that range
queries over their (future or past) locations can be answered quickly. Our focus
is on what we call time responsive data structures that have fast response time
for near-future or near-past queries but may take more time for queries that are
far away in time. Time responsiveness is important in, e.g., air-traffic control,
where queries in the near future are more critical than queries far away in the
future.



1.1 Problem Statement 

Let $S = \{p_1, p_2, \ldots, p_N\}$ be a set of moving points in $\mathbb{R}^d$, d = 1, 2. For any time
t, let $p_i(t)$ denote the position of $p_i$ at time t, and let $S(t) = \{p_1(t), \ldots, p_N(t)\}$.
We will assume that each point $p_i$ is moving along a straight line at some fixed
speed, that is, $p_i(t) = a_i \cdot t + b_i$ for some $a_i, b_i \in \mathbb{R}^d$. We will use $t_{now}$ to denote
the current time. We are interested in answering queries of the following form:

Q1. Given a set S of points moving along the y-axis, a y-range $R = [y_1, y_2]$,
and a time $t_q$, report all points of S that lie inside R at time $t_q$, that is,
$S(t_q) \cap R$.

Q2. Given a set S of points moving in the xy-plane, an axis-aligned rectangle
R, and a time $t_q$, report all points of S that lie inside R at time $t_q$, that is,
$S(t_q) \cap R$.
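
To make the semantics of the two query types concrete, the following sketch
answers them by a plain linear scan. It is only a reference baseline that
inspects every point; it is not one of the data structures discussed in this
paper. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MovingPoint:
    """A point with position (ax * t + bx, ay * t + by) at time t."""
    ax: float
    bx: float
    ay: float
    by: float

    def position(self, t):
        return (self.ax * t + self.bx, self.ay * t + self.by)

def query_q1(points, y1, y2, tq):
    """Q1: points whose y-coordinate lies in [y1, y2] at time tq."""
    return [p for p in points if y1 <= p.position(tq)[1] <= y2]

def query_q2(points, x1, x2, y1, y2, tq):
    """Q2: points inside the rectangle [x1, x2] x [y1, y2] at time tq."""
    result = []
    for p in points:
        x, y = p.position(tq)
        if x1 <= x <= x2 and y1 <= y <= y2:
            result.append(p)
    return result

# Example data (assumed): three points queried at time 4.
pts = [MovingPoint(0, 0, 1, 0), MovingPoint(1, 0, -1, 10), MovingPoint(2, 1, 0, 3)]
```

Because only the coefficients are stored, a query at any past or future time
needs no update to the stored data; the cost of the scan, however, is
proportional to N regardless of the output size.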

As our main interest is minimizing the number of disk accesses needed to
answer a query, we will consider the problem in the standard external memory
model; see e.g. [5]. This model assumes that each disk access transmits a
contiguous block of B units of data in a single input/output operation (or I/O). The
efficiency of a data structure is measured in terms of the amount of disk space it
uses (in units of disk blocks) and the number of I/Os required to answer a range
query. As we are interested in solutions that are output sensitive, our query I/O
bounds are not only expressed in terms of N, the number of points in S, but
also in terms of K, the number of points reported by the query. The minimum
number of disk blocks we need to store N points is $\lceil N/B \rceil$, and at least $\lceil K/B \rceil$
I/Os are needed to report K output points. We refer to these bounds as “linear.”
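
As a small worked illustration of the “linear” bounds, the helper below computes
the ceiling of a count divided by the block size using integer arithmetic. The
function name and the sample numbers are hypothetical.

```python
def blocks_needed(items, block_size):
    """Ceiling of items / block_size for non-negative integers."""
    return -(-items // block_size)

# Example numbers (assumed): N = 1000 points, K = 130 reported, B = 64.
space_blocks = blocks_needed(1000, 64)
report_ios = blocks_needed(130, 64)
```

With these numbers, storing the points takes 16 blocks and reporting the output
takes at least 3 I/Os.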



1.2 Previous Results 

Recently, there has been a flurry of activity in computational geometry and 
databases on problems dealing with moving objects. In the computational geom- 
etry community, Basch et al. 0 introduced the notion of kinetic data structures. 
Their work led to several interesting internal memory results related to moving 
points; see P2I and references therein. The main idea in the kinetic framework 
is that even though the points move continuously, the relevant combinatorial
structure of the data structure changes only at certain discrete times. Therefore
the data structure does not need to be updated continuously. Instead kinetic
updates are performed on the data structure only when certain kinetic events occur.
These events have a natural interpretation in terms of the underlying structure.
In contrast, in fixed-time-step methods, where the structure is updated at fixed
time intervals, the fastest moving object determines an update time step for the
entire data structure. Even though the kinetic framework often leads to very
query efficient structures, one disadvantage of kinetic data structures is that
queries can only be answered at the current time (i.e., in chronological order).
Thus while kinetic data structures are very useful in simulation applications they
are less suited for answering the types of queries we consider in this paper.

In the database community, a number of practical methods have been pro- 
posed for handling moving objects (see |2()f1 tij and the references therein). Al- 
most all of them require Q{N/B) I/Os in the worst case to answer a QI or 
Q2 query — even if the query output size is 0(1). Kollios et al. proposed 
the first provably efficient data structure, based on partition trees |2ECS|, for 
queries of type Ql. The structure uses 0{N/B) disk blocks and answers queries 
in 0{{N / + K/B) I/Os, for any e > 0. Agarwal et al. ^ extended the 
result to Q2 queries. Kollios et al. H2| also present a scheme that answers 
a Ql query using optimal 0{logQ N + K/B) I/Os but using 0{N‘^/B) disk 
blocks. These data structures are time- oblivious, that is, they do not evolve 
over time. Agarwal et al. were the first to consider kinetic data structure 
in external memory. Based on external range trees 0, they developed a data 
structure that answers a Q2 query in optimal 0{logg N K/B) I/Os using 
0{{N/B) logs -^/(logs logs ^)) disk blocks — provided, as discussed above, that 
the queries arrive in chronological order. The amortized cost of a kinetic event 
is 0(logs N) I/Os, and the total number of events is 0{N'^). They also showed 
how to modify the structure in order to obtain a tradeoff between the query 
bound and the number of kinetic events. 

Agarwal et al. were also the first to propose time responsive data
structures in the context of moving points. They developed an O(N/B) space
structure for Q1 queries and an $O((N/B) \log_B N / (\log_B \log_B N))$ space structure for Q2
queries, where the cost of a query at time $t_q$ is a monotonically increasing
function of the difference between $t_{now}$ and $t_q$. The query bound never exceeds
$O((N/B)^{1/2+\varepsilon} + K/B)$. They were only able to prove more specific bounds when
the positions and velocities of the points are uniformly distributed inside a unit
square.



1.3 Our Results 

In this paper we combine the ideas utilized in time-oblivious and kinetic data
structures in order to develop the first time responsive external data structures
for Q1 and Q2 queries with provably efficient specific query bounds depending on
the difference between tnow and tq. Our structures evolve over time, but unlike
previous kinetic structures, queries can be answered at any time in the future.
Our data structures are of a combinatorial nature and we therefore measure time
in terms of kinetic events. We define φ(t) to be the number of kinetic events that
occur (or have occurred) between tnow and t, and our query bounds will depend
on φ(tq), the number of kinetic events between the current time and tq. For
brevity, we will focus on queries in the future, i.e., tq > tnow. The somewhat
simpler case of queries in the past (tq < tnow) can be handled similarly.
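
The event count φ(t) can be made concrete with a small in-memory sketch for
points moving on a line. It is only an illustration under stated assumptions:
each point is a hypothetical pair (y0, v) with position y0 + v*t, all pairs
are checked by brute force, and the names crossing_time and phi are not from
the paper.

```python
from itertools import combinations


def crossing_time(p, q):
    """Time at which two 1D points (y0, v) meet, or None if they never do."""
    (y0p, vp), (y0q, vq) = p, q
    if vp == vq:
        return None  # equal velocities: the trajectories are parallel
    return (y0q - y0p) / (vp - vq)


def phi(points, t_now, t):
    """Number of kinetic events (pairwise crossings) between t_now and t."""
    lo, hi = min(t_now, t), max(t_now, t)
    count = 0
    for p, q in combinations(points, 2):
        tc = crossing_time(p, q)
        if tc is not None and lo < tc <= hi:
            count += 1
    return count
```

A query time tq is "near" in the sense used here when phi(points, t_now, tq)
is small, regardless of how large tq - t_now is in absolute terms.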

In Section 3 we describe a time responsive data structure for Q1 queries.
A kinetic event occurs when two points pass through each other (become equal)
and their relative orderings change. Our data structure uses O((N/B) log_B N)
disk blocks and answers a Q1 query with time stamp tq such that φ(tq) ≤ NB^i
in O(B^(i-1) + log_B N + K/B) I/Os. The expected amortized cost of a kinetic
event is O(log_B^2 N) I/Os. Note that a query at a time tq with φ(tq) ≤ NB can
be answered in optimal O(log_B N + K/B) I/Os. Previously such query efficient
structures either used O(N^2/B) space [13] or required the queries to arrive in
chronological order Q. Our data structure is considerably simpler than the one
proposed in 0 and does not make any assumptions on the distribution of the
trajectories in order to prove a bound on the query time.

In Section 4 we describe a time responsive data structure for Q2 queries.
A kinetic event now occurs when the x- or y-coordinates of two points become
equal. This data structure uses O((N/B) log_B N) disk blocks and answers a
Q2 query with time stamp tq such that φ(tq) ≤ NB^i in
O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os. Each kinetic event is handled
in O(log_B^2 N) expected I/Os. If φ(tq) ≤ NB the query is answered in
O(sqrt(N/B) log_B N + K/B) I/Os, an improvement over previous
O((N/B)^(1/2+ε) + K/B) I/O structures for non-chronological queries.



2 Preliminaries 

Arrangements. Given a set S of N lines in ℝ^2, the arrangement A(S) is
the planar subdivision whose vertices are the intersection points of lines, edges
are the maximal portions of lines not containing any vertex, and faces are the
maximal connected portions of the plane not containing any line of S. A(S) has
O(N^2) vertices, edges, and faces. For each 1 ≤ k ≤ N, the k-level A_k(S) of
A(S) is defined as the closure of all edges in A(S) that have exactly k lines of
S (strictly) below them. The k-level is a polygonal chain that is monotone with
respect to the horizontal axis. Dey PH showed that the maximum number of
vertices on the k-level in an arrangement of N lines in the plane is O(N k^(1/3)).
Recently, Toth proved a lower bound of Ω(N 2^sqrt(log k)) on the complexity of a
k-level E3. Using a result by Edelsbrunner and Welzl El, Agarwal et al. ^
discussed how a given level of an arrangement of lines can be computed
I/O-efficiently.

Lemma 1 (Agarwal et al. [2]). A given level with T vertices in an arrangement
of N lines can be computed in O(N log_2 N + T log_2 N log_B N) I/Os.

B-trees. A B-tree, one of the most fundamental external data structures, stores
N elements from an ordered domain using O(N/B) disk blocks so that a
one-dimensional range query can be answered in O(log_B N + K/B) I/Os and an
element can be inserted/deleted in O(log_B N) I/Os. A standard B-tree answers
queries only on the set of elements currently in the structure. A persistent (or
multiversion) B-tree, on the other hand, supports range queries in all states
(versions) of the data structure in O(log_B N + K/B) I/Os [ISI18j. Updates can
be performed in O(log_B N) I/Os on the current structure, and we refer to the
structure existing after T updates as the structure existing at time T.

Lemma 2 aw)- A persistent B-tree constructed by performing N updates
using O(log_B N) I/Os each, uses O(N/B) disk blocks and supports range queries
at any time in O(log_B N + K/B) I/Os.

3 Data Structure for Moving Points in ℝ

In this section we consider queries of type Q1. If we interpret time as the t-axis
in the parametric ty-plane, each point in S traces out a line in this ty-plane.
Abusing the notation a little, we will use S to denote the resulting set of N
lines. A Q1 query then corresponds to reporting all lines of S that intersect a
vertical segment (Figure 1(i)).





Fig. 1. (i) Lines in the ty-plane traced by S. A Q1 query corresponds to finding
the lines intersected by segment [(tq, y1), (tq, y2)]. (ii) Two windows for N = 6
and B = 2.



Since a vertical line ℓ : t = a induces a total order on the lines in S, namely
the relative ordering of the points in S(a), and since this ordering does not
change until two points pass through each other, we can design an efficient data
structure using a persistent B-tree as follows: We sweep A(S) from left to right
(-∞ to ∞) with a vertical line, inserting a segment from A(S) in a persistent
B-tree when its left endpoint is encountered and deleting it again when its right
endpoint is encountered. We can then answer a Q1 query with range R = [y1, y2]
by performing a range query with R at time tq. Since the arrangement is of size
O(N^2), Lemma 2 implies that this data structure answers queries in the optimal
O(log_B N + K/B) I/Os using O(N^2/B) space.¹ In order to improve the size
to O((N/B) log_B N), while at the same time obtaining a time responsive data
structure, we divide the ty-plane into O(log_B N) vertical slabs (or windows) and
store a modified linear-space version of the above data structure in each window.

¹ Strictly speaking, Lemma 2 as described in |H] does not hold for line segments
since not all segments are above/below-comparable. However, in 0 it is discussed
how to extend the result to line segments.
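
As a reference point for what a Q1 query must return, the following in-memory
sketch evaluates the total order induced by the vertical line at time tq and
performs the range search on it. It is a brute-force stand-in, not the
persistent B-tree described above; the (y0, v) point representation and the
function names are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def positions_at(points, t):
    """Sorted (y, index) pairs: the order induced by the vertical line at time t."""
    return sorted((y0 + v * t, i) for i, (y0, v) in enumerate(points))


def q1_query(points, y1, y2, tq):
    """Indices of all points whose position at time tq lies in [y1, y2]."""
    order = positions_at(points, tq)
    ys = [y for y, _ in order]
    lo = bisect_left(ys, y1)    # first position >= y1
    hi = bisect_right(ys, y2)   # one past the last position <= y2
    return [i for _, i in order[lo:hi]]
```

The persistent structure avoids re-sorting for every query by storing every
version of this order along the sweep; the sketch only fixes the expected
output so that a faster structure can be checked against it.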



3.1 Overall Structure 

Let t_1, ..., t_ν, ν ≤ (N choose 2), be the sorted sequence of the t-coordinates
of the vertices in the arrangement A(S). For 1 ≤ i ≤ ⌈log_B N⌉, set
τ_i = t_(NB^i), i.e., NB^i events occur before τ_i. We define the first window
W_1 to be the vertical slab [-∞, τ_1] × ℝ, and the ith window W_i,
2 ≤ i ≤ ⌈log_B N⌉, to be the vertical slab [τ_(i-1), τ_i] × ℝ.
Figure 1(ii) shows an example of an arrangement of six lines with two windows.

Our data structure consists of a B-tree T on τ_1, τ_2, ..., as well as a window
structure WI_i for each window W_i. In Section 3.2 below we first describe the
algorithm for constructing the windows, and in Section 3.3 we then describe the
data structure WI_i that uses O(N/B) disk blocks and answers a query with
tq ∈ [τ_(i-1), τ_i] in O(B^(i-1) + log_B N + K/B) I/Os. To answer a Q1 query we
first use T to determine in O(log_B N) I/Os the window W_i containing tq, and
then we search WI_i with R to report all K points of R ∩ S(tq) in
O(B^(i-1) + log_B N + K/B) I/Os.
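
The window boundaries and the lookup of the window containing tq can be
sketched in memory as follows. This assumes the sorted list of all event times
is already available, which is exactly what the paper avoids computing; the
function names are illustrative, and a sorted Python list with binary search
stands in for the B-tree on the boundaries.

```python
from bisect import bisect_left


def window_boundaries(event_times, n, B):
    """tau_i = t_(n * B**i), 1-indexed, for every i with that many events."""
    taus = []
    k = n * B
    while k <= len(event_times):
        taus.append(event_times[k - 1])
        k *= B
    return taus


def window_index(taus, tq):
    """1-based i with tau_(i-1) <= tq <= tau_i (taus must be sorted).

    Queries beyond the last boundary map to len(taus) + 1.
    """
    return bisect_left(taus, tq) + 1
```

Because the boundaries grow geometrically in the number of events, only a
logarithmic number of them exist, so the lookup itself is cheap compared with
the search inside the window.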



3.2 Computing Windows 

We describe an algorithm that computes the O(log_B N) t-coordinates τ_1, τ_2, ...
using O((N/B) log_2 N log_B N) I/Os. We cannot afford to compute all vertices
of the arrangement A(S) and then choose the desired t-coordinates. Instead we
present an algorithm that, for any k ∈ ℕ, computes the kth leftmost vertex of
A(S) I/O-efficiently. To obtain the τ_i's we run this algorithm with k = NB^i for
1 ≤ i ≤ ⌈log_B N⌉.

To find the kth leftmost vertex we first choose N random vertices of A(S)
and sort them by their t-coordinates. As shown in PH, the merge-sort algorithm
can be modified to compute the N vertices without explicitly computing all
vertices of A(S). By using external merge-sort 0 we use O((N/B) log_B N) I/Os
to compute the N vertices, and we can sort them in another O((N/B) log_B N)
I/Os.² Let p_1, p_2, ..., p_N be the sequence of vertices in the sorted order, and
let W_i be the vertical slab defined by vertical lines through p_(i-1) and p_i. A
standard probabilistic argument, omitted from this abstract, shows the following.

Lemma 3. With probability at least 1 - 1/N, each slab W_i contains O(N)
vertices of A(S).

Suppose we have a procedure Count(W_i) that counts using
O((N/B) log_B N) I/Os the number of vertices of A(S) lying inside a vertical
slab W_i. By performing a binary search and using Count at each step
of the search, we can compute in O((N/B) log_2 N log_B N) I/Os the index i
such that the kth leftmost vertex of A(S) lies in the vertical strip W_i. Using
Count(W_i) again we can check whether W_i contains more than cN vertices of
A(S), where c is the constant hidden in the big-Oh notation in Lemma 3. If it
does, we restart the algorithm. By Lemma 3, the probability of restarting the
algorithm is at most 1/N. If W_i contains at most cN vertices, we find all
vertices of A(S) lying inside the strip W_i in O((N/B) log_B N) I/Os and choose
the desired vertex, using a simple modification of merge-sort; details will appear
in the full paper.

² To obtain this bound we assume that the internal memory is capable of holding B
blocks. The more general bound obtained when the internal memory is of size M
will be given in the full paper.

What remains is to describe the Count(W_i) procedure. Note that two lines
ℓ and ℓ' intersect inside W_i if they intersect its left and right boundaries in
different order, i.e., ℓ lies above ℓ' at the left boundary of W_i but below ℓ' at
the right boundary, or vice versa. The problem of counting the number of vertices
of A(S) inside W_i reduces to counting the number of pairs of lines in S that
have different relative orderings at the two boundaries of W_i. It is well known
that the number of such pairs can be counted using a modified version of
merge-sort. Hence, we can count the number of desired vertices using
O((N/B) log_B N) I/Os by modifying external merge-sort 0|. Putting everything
together, we can compute the τ_i's in a total of O((N/B) log_2 N log_B N) I/Os.
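
The reduction behind Count can be sketched as an inversion count. The version
below is an ordinary in-memory merge sort, not the external-memory variant the
paper relies on, and it assumes lines given as hypothetical (y0, v) pairs in
general position (no two lines meet exactly on a slab boundary).

```python
def count_inversions(seq):
    """Return (sorted copy, number of pairs i < j with seq[i] > seq[j])."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, inv_left = count_inversions(seq[:mid])
    right, inv_right = count_inversions(seq[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i  # right[j] jumps over all remaining left items
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv


def count_vertices_in_slab(lines, t_left, t_right):
    """Number of pairs of lines ordered differently at the two slab boundaries."""
    by_left = sorted(lines, key=lambda ln: ln[0] + ln[1] * t_left)
    right_positions = [y0 + v * t_right for y0, v in by_left]
    return count_inversions(right_positions)[1]
```

Sorting by the left boundary and counting inversions of the right-boundary
positions touches each line a logarithmic number of times, which is what makes
a merge-sort based external version plausible.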

3.3 Window Data Structure 

Let W_i = [τ_(i-1), τ_i] × ℝ be a window containing NB^i vertices of A(S). We
describe how to preprocess A(S) into a data structure WI_i that uses O(N/B)
disk blocks and answers a Q1 query in O(B^(i-1) + log_B N + K/B) I/Os. Since W_i
contains NB^i vertices of A(S), a persistent B-tree constructed on W_i would use
O(NB^(i-1)) disk blocks. We therefore instead only build the persistent structure
on N/B^i carefully selected levels of A(S) and build separate structures for each
of the N/B^i bundles defined by these levels. Our algorithm relies on the following
simple lemma about randomly chosen levels of the arrangement A(S).

Lemma 4. For a given integer ξ ∈ [0, B^i - 1], let
Λ_ξ = {A_(jB^i+ξ)(S) | 1 ≤ j ≤ N/B^i} be N/B^i levels of A(S). If the integer
ξ is chosen randomly, then the expected number of vertices in all the levels of
Λ_ξ whose t-coordinates lie inside W_i is O(N).

The preprocessing algorithm works as follows. We choose a random integer
ξ ∈ [0, B^i - 1]. For 1 ≤ j ≤ N/B^i, let λ_j be the (jB^i + ξ)-level of A(S). Set
Λ_ξ = {λ_j | 1 ≤ j ≤ N/B^i}. We refer to Λ_ξ as the set of critical levels. We
compute Λ_ξ using Lemma 1. If during the construction, the size of Λ_ξ becomes
more than 2cN, where c is the constant hidden in the big-Oh notation in Lemma 4,
then we abort the construction, choose another random value of ξ, and repeat the
above step. This way we make sure that the critical levels are of size O(N). By
Lemma 4 the algorithm is aborted O(1) expected times, so the expected number
of I/Os needed to construct the critical levels is O(N log_2 N log_B N) (Lemma 1).
We store Λ_ξ in a persistent B-tree T_i by sweeping the ty-plane with a vertical
line from t = τ_(i-1) to τ_i, so that for a vertical segment R = [y1, y2] and a
time instance tq ∈ [τ_(i-1), τ_i], we can compute the critical levels intersected
by R. Since there are O(N) vertices in Λ_ξ, T_i uses O(N/B) space and can be
constructed in O((N/B) log_B N) I/Os.

For 1 ≤ j ≤ N/B^i, let bundle B_j be the union of the levels that lie between
λ_(j-1) and λ_j, including λ_(j-1). See Figure 2(i). For each bundle B_j, we store
the set of lines in B_j at any time t ∈ [τ_(i-1), τ_i] in a separate persistent
B-tree D_j. More precisely, we assume that every line in S has a unique identifier
and keep the lines ordered by this identifier in D_j. Let D_j(t) denote the version
of D_j at time t. We sweep the window W_i from left to right. Initially D_j stores
the lines of S in B_j at time τ_(i-1). A line leaves or enters B_j at a vertex of
λ_(j-1) or λ_j. Therefore we update D_j and D_(j+1) at each vertex of λ_j. Since
Λ_ξ is of size O(N), the total number of I/Os spent in constructing all the bundle
structures is O((N/B) log_B N) and the total space used is O(N/B) blocks.
Finally, for every vertex v = (t_v, y_v) on the critical level λ_j, we store a
pointer to the roots of structures D_j(t_v) and D_(j+1)(t_v).




Fig. 2. (i) The N/B^i critical levels in W_i = [τ_(i-1), τ_i] × ℝ. Bundle B_2 is
shaded. (ii) Query with R = [y1, y2] at time tq. All points in the shaded bundles
at time tq and all relevant points in the bundles containing the endpoints of R
are reported.



To answer a Q1 query, we first use T_i to find the critical levels intersecting
R at time tq in O(log_B N) I/Os. Next we use the pointers from the vertices
on the critical levels to find the bundle structures D_j of bundles completely
spanned by R at time tq. We report all points in these structures using O(K/B)
I/Os. Finally, we scan all lines in the (at most) two bundles intersected but
not completely spanned by R using O(B^i/B) = O(B^(i-1)) I/Os and report the
relevant points. In conclusion, we answer the query in
O(B^(i-1) + log_B N + K/B) I/Os; see Figure 2(ii).
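
The decomposition of a query into fully spanned bundles and at most two
partially spanned ones can be illustrated at a single fixed time. The sketch
below works on a plain sorted list of positions and ignores persistence and
disk blocks entirely; build_bundles, bundle_query, and the offset parameter
(playing the role of the random shift of the critical levels) are assumptions
for illustration.

```python
def build_bundles(ys_sorted, bundle_size, offset):
    """Cut a sorted list at ranks offset, offset + bundle_size, ..."""
    cuts = [c for c in range(offset, len(ys_sorted), bundle_size) if c > 0]
    bounds = [0] + cuts + [len(ys_sorted)]
    return [ys_sorted[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def bundle_query(bundles, y1, y2):
    """Return (values in [y1, y2], number of items scanned in partial bundles)."""
    reported, scanned = [], 0
    for bundle in bundles:
        if bundle[-1] < y1 or bundle[0] > y2:
            continue  # bundle does not meet the query range
        if y1 <= bundle[0] and bundle[-1] <= y2:
            reported.extend(bundle)  # completely spanned: report wholesale
        else:
            scanned += len(bundle)   # partially spanned: scan and filter
            reported.extend(y for y in bundle if y1 <= y <= y2)
    return reported, scanned
```

Only the bundles holding the two endpoints of the range are scanned, so the
filtering cost is bounded by twice the bundle size, mirroring the additive
term in the query bound.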



Lemma 5. Let S be a set of N points moving along the y-axis with fixed
velocities, and let W_i = [τ_(i-1), τ_i] × ℝ be a window such that A(S) has
O(NB^i) vertices inside W_i. We can preprocess S into a data structure WI_i
in O(N log_2 N log_B N) expected I/Os such that a Q1 query with
tq ∈ [τ_(i-1), τ_i] can be answered in O(B^(i-1) + log_B N + K/B) I/Os. The
number of disk blocks used by the data structure is O(N/B).

This completes the description of the basic data structure, but there is one
technical difficulty that one has to overcome. Immediately after building our
data structure, a query in the first window W_1 can be answered in the optimal
O(log_B N + K/B) I/Os. However, if we do not modify the data structure, the
query performance of the structure deteriorates as time elapses. For example,
for tnow > τ_1, a query within the first NB events after tnow requires
Ω(B + log_B N + K/B) I/Os. To circumvent this problem, we rebuild the entire
structure during the interval [t_(NB/2), t_(3NB/4)] as though tnow were
t_(3NB/4) and switch to this structure at time t_(3NB/4). This allows us to
always answer a query at time tq with φ(tq) ≤ NB^i/4 in
O(B^(i-1) + log_B N + K/B) I/Os. Since we use O(N log_2 N log_B N) I/Os to
rebuild the structure, we charge O((log_2 N log_B N)/B) = O(log_B^2 N) I/Os to
each of the NB/4 events to pay for the reconstruction cost. Putting everything
together, we obtain the following.

Theorem 1. Let S be a set of N points moving along the y-axis with fixed
velocities. S can be maintained in a data structure using O(log_B^2 N) expected
I/Os per kinetic event such that a Q1 query at time tq, with
NB^(i-1) ≤ φ(tq) ≤ NB^i, can be answered in O(B^(i-1) + log_B N + K/B) I/Os. The
structure can be built in O(N log_2 N log_B N) expected I/Os and uses
O((N/B) log_B N) disk blocks.

Remarks. 

(i) Our window structure WI_i (Lemma 5) can easily be modified to work even if
it is built on a set of line segments instead of lines, provided the lines obtained
by extending the line segments have O(NB^i) intersections in W_i. We simply
construct the critical levels λ_j on the lines but the bundle structures D_j on
the segments.

(ii) Our result can be extended to the case in which the trajectory of each point
is a piecewise-linear function of time, i.e., in the ty-plane, S is a set of N
polygonal chains with a total of T vertices. Then the query time remains the
same and the total size of the data structure is O(T/B) disk blocks.

4 Data Structure for Moving Points in ℝ^2

We now turn to points moving in ℝ^2. Analogously to the 1D-case, S(t) traces
out N lines in xyt-space, and our structure for Q2 queries utilizes the same
general ideas as our structure for Q1 queries. While a kinetic event in the
1D-case corresponds to two points passing each other, an event now occurs when
the x- or y-coordinates of two points coincide. We divide the xyt-space into
O(log_B N) horizontal slices along the t-axis such that slice S_i contains
O(NB^i) kinetic events; see Figure 3(i). That is, we choose a sequence
τ_1 < τ_2 < ... of log_B N time instances so that O(NB^i) events occur in the
interval [τ_(i-1), τ_i]. We set S_i = ℝ^2 × [τ_(i-1), τ_i]. As previously, our
data structure consists of a B-tree on τ_1, τ_2, ..., and a linear-space slice
structure SC_i for each slice S_i. Below, we will design a linear space data
structure SC_i, which can be used to answer a Q2 query with
tq ∈ [τ_(i-1), τ_i] in O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os. Each
SC_i can be constructed in O(N log_2 N log_B N) expected I/Os and updated in
O(log_B^2 N) I/Os.

Fig. 3. (i) Three slices. (ii) The lines in xyt-space traced by S(t) and their
projections.

To answer a Q2 query at time tq with a rectangle R, we first determine
the slice S_i containing tq and then we query SC_i to report all K points of
S(tq) ∩ R in O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os. Our global data
structure uses O((N/B) log_B N) disk blocks in total. The structure can be
constructed in O(N log_2 N log_B N) expected I/Os since we can compute the τ_i's
as follows. Let A^x(S) be the arrangement obtained by projecting the lines of S
onto the tx-plane (Figure 3(ii)). Let k_i = NB^i for 1 ≤ i ≤ log_B N. As in
Section 3.2 we can compute, using O(N log_2 N log_B N) I/Os, the time instances
τ_i^x, 1 ≤ i ≤ log_B N, so that the t-coordinate of k_i vertices of A^x(S) is at
most τ_i^x. Similarly, we compute the time instances τ_i^y, 1 ≤ i ≤ log_B N, so
that the t-coordinate of k_i vertices of A^y(S) is at most τ_i^y. Set
τ_0^x = τ_0^y = -∞ and τ_i = min{τ_i^x, τ_i^y} for 0 ≤ i ≤ log_B N. Define the
slice S_i to be ℝ^2 × [τ_(i-1), τ_i] for 1 ≤ i ≤ log_B N. S_i is guaranteed to
contain less than 2NB^i events. Finally, as in the 1D-case, we rebuild the data
structure every O(NB) events and obtain the following.

Theorem 2. Let S be a set of N points moving in the xy-plane with fixed
velocities. S can be maintained in a data structure using O(log_B^2 N) expected
I/Os per kinetic event so that a Q2 query at time tq, with
NB^(i-1) ≤ φ(tq) ≤ NB^i, can be answered in
O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os. The structure can be built
in O(N log_2 N log_B N) expected I/Os and uses O((N/B) log_B N) disk blocks.

Slice Structure. We now describe our slice data structure SC_i. Answering
a Q2 query corresponds to finding the lines traced out by S in xyt-space that
intersect a rectangle R on the plane t = tq parallel to the xy-plane. Note that
a line ℓ intersects R if and only if their projections onto the tx- and ty-planes
both intersect. See Figure 3(ii). Let S^x denote the projection of S onto the
tx-plane, and let S_i^x, S_i^y be the projections of the slice S_i onto the tx-
and ty-planes, respectively. As earlier, we define a set of critical levels λ_j of
A(S^x) within the projected window S_i^x and store it in a persistent B-tree T_i.
We choose a random integer ξ ∈ [0, sqrt(NB^i)] and set λ_j to be the
(j·sqrt(NB^i) + ξ)-level of A(S^x). We have sqrt(N/B^i) critical levels, and we
define bundle B_j to be the sqrt(NB^i) levels between critical level λ_(j-1) and
λ_j. Refer to Figure 4(i). Using Lemma 4 we can prove that the total expected
size of the critical levels is O(N + NB^i/sqrt(NB^i)) = O(N). For each bundle
B_j we construct a 1D-data structure on S^y, the projection of S onto the
ty-plane, as follows.

Fix a bundle B_j. A point p ∈ S can leave and return to B_j several times,
i.e., its x-projection may cease to lie between the levels λ_(j-1) and λ_j and
then it may appear there again. Therefore, for a point p ∈ S, let Δ_1, ..., Δ_r
be the maximal time intervals, each a subset of [τ_(i-1), τ_i], during which p
lies in B_j. We map p to a set S_p = {e_1, ..., e_r} of segments in the ty-plane,
where e_k, the part of the trajectory traced during Δ_k, is a segment contained
in the ty-projection of the trajectory of p. Define S_j^y as the union of the
sets S_p over all p ∈ S. Since the endpoint of a segment in S_j^y corresponds to
a vertex of λ_(j-1) or λ_j, the total size of the sets S_j^y is O(N). We construct
the window structure WI_j on S_j^y. Our construction of slices guarantees that
there are O(NB^i) vertices of A(S^y) in the ty-projection S_i^y of the slice S_i.
Therefore, the remark at the end of Section 3 implies that all the bundle
structures use a total of O(N/B) disk blocks and that they can be constructed
in O(N log_2 N log_B N) I/Os.





Fig. 4. (i) The sqrt(N/B^i) critical levels in the arrangement A(S^x) of the
projection of S onto the tx-plane. (ii) One bundle in the window structure built
on the ty-projection of segments corresponding to points in bundle B_j in the
tx-plane.



To answer a Q2 query, we first use T_i to find the critical levels, and thus
bundles, intersecting the tx-projection R^x of R in O(log_B N) I/Os. For all
O(sqrt(N/B^i)) bundles B_j completely spanned by R^x, we query the corresponding
window structure to find all points also intersecting the ty-projection R^y of
R using O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os (Lemma 5). Finally, we
scan the (at most) two bundles intersected but not completely spanned by R^x
using O(sqrt(NB^i)/B + K/B) = O(sqrt(N/B^i) · B^(i-1) + K/B) I/Os and report
the remaining points in R at time tq. Refer to Figure 4(i).
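
The projection argument used by the slice structure can be checked against a
brute-force reference: a point is reported exactly when its tx-projection meets
the x-range of R and its ty-projection meets the y-range of R at time tq. The
tuple layouts (x0, vx, y0, vy) and (x1, x2, y1, y2) and the function name are
assumptions for this sketch.

```python
def q2_query(points, rect, tq):
    """Indices of points inside rect = (x1, x2, y1, y2) at time tq."""
    x1, x2, y1, y2 = rect
    # Points whose tx-projection intersects the x-range of the rectangle.
    hits_x = {i for i, (x0, vx, _, _) in enumerate(points)
              if x1 <= x0 + vx * tq <= x2}
    # Points whose ty-projection intersects the y-range of the rectangle.
    hits_y = {i for i, (_, _, y0, vy) in enumerate(points)
              if y1 <= y0 + vy * tq <= y2}
    return sorted(hits_x & hits_y)
```

The slice structure organizes the first filter by bundles in the tx-plane and
answers the second filter with one window structure per bundle, rather than
intersecting two full candidate sets as done here.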

Lemma 6. Let S be a set of N points moving in the xy-plane with fixed
velocities, and let S_i be a time slice ℝ^2 × [τ_(i-1), τ_i] that contains NB^i
events. We can preprocess S into a data structure SC_i of size O(N/B) in
O(N log_2 N log_B N) I/Os such that a Q2 query at time τ_(i-1) ≤ tq ≤ τ_i can
be answered in O(sqrt(N/B^i) · (B^(i-1) + log_B N) + K/B) I/Os.
Abstract. In this paper we discuss the kinetic maintenance of the Euclidean
Voronoi diagram and its dual, the Delaunay triangulation, for
a set of moving disks. The most important aspect in our approach is 
that we can maintain the Voronoi diagram even in the case of intersect- 
ing disks. We achieve that by augmenting the Delaunay triangulation 
with some edges associated with the disks that do not contribute to the 
Voronoi diagram. Using the augmented Delaunay triangulation of the set 
of disks as the underlying structure, we discuss how to maintain, as the 
disks move, (1) the closest pair, (2) the connectivity of the set of disks 
and (3) in the case of non-intersecting disks, the near neighbors of a disk. 



1 Introduction 

Geometric objects that move with time appear in many problems in motion plan- 
ning, geometric modeling, computer simulations of physical systems and virtual 
environments, robotics, computer graphics and animation, mobile communica- 
tions, ad hoc networks or group communication in military operations. The aim 
is to answer questions concerning proximity information among the geometric 
objects, such as find the closest/farthest pair, report all near neighbors, predict 
collisions or report reachability between a pair of geometric objects. In many 
cases we can approximate the geometric objects by disks. The problem then 
reduces to answering proximity questions for a set of disks. 

The Voronoi diagram is a data structure that can be used to produce answers 
to many of these questions. Voronoi diagrams have been successful in robotics 
applications such as collision detection PU and retraction motion planning mg. 
Dynamic or kinetic Voronoi diagrams for moving objects in the plane have also 
appeared in the literature. There are papers discussing the maintenance of the
Voronoi diagram for a set of points ^, the maintenance of the Voronoi diagram
for a set of moving convex polygons jS|, as well as the maintenance of near
neighbors of points in sets of moving points |0|.

Algorithms for maintaining the Voronoi diagram for sets of non-intersecting
moving disks have also appeared. In 0 the Voronoi diagram for disks with
respect to the Euclidean metric is considered. The maintenance of the Voronoi
diagram with respect to the power distance, also called the Power diagram,
is discussed in p. The most appealing feature of the Power diagram is that
it consists of straight arcs, in contrast to the Euclidean Voronoi diagram that
consists of straight or hyperbolic arcs. The main drawback of the Power diagram
is that the intersection between a disk and its Voronoi cell may be empty, even
if the Voronoi cell is not empty. In the Euclidean Voronoi diagram, however,
a disk with non-empty Voronoi cell always intersects its cell m-

* Supported by NSF grant CCR-9910633.

Fig. 1. Left: the edge connecting B1 and B2 is locally Delaunay because the
exterior tangent ball of B1, B2 and D1 does not intersect D2. Right: the edge
connecting B1 and B2 is locally Delaunay because the interior tangent ball of
B1, B2 and D1 is not contained in D2. The exterior/interior tangent ball of B1,
B2 and D1 is shown in light gray.

In this paper we tackle the problem of maintaining the Voronoi diagram, or 
its dual the Delaunay Triangulation (DT) for a set of disks moving in the plane. 
The major contribution of this paper is that the disks are allowed to intersect. 
This enables us to not only report collisions between disks, but also to report 
when the penetration depth between two disks achieves a certain value, or when 
a disk is wholly contained inside another disk. Moreover, our data structure can 
be used for maintaining the connectivity of the set of disks as the disks move. 

The Voronoi diagram is maintained using the Kinetic Data Structure (KDS)
framework. In the KDS setting one maintains a geometric structure under
continuous motion through a set of certificates proving its correctness.
An event queue is maintained for the failure times of these certificates
and at each event the structure of interest and its kinetic proof are appropriately 
updated. 
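
The certificate and event-queue mechanism can be illustrated on the simplest
kinetic structure, a sorted list of points moving on a line, where each
certificate asserts that two adjacent points are in order. This is a generic
sketch and not the Voronoi-specific structure of this paper; it assumes
hypothetical (y0, v) points with distinct initial positions, no two events at
the same time, and a start time of 0.

```python
import heapq


def failure_time(a, b):
    """Time at which a (currently below b) overtakes b, or None if never."""
    (ya, va), (yb, vb) = a, b
    if va <= vb:
        return None  # a never catches up: the certificate holds forever
    return (yb - ya) / (va - vb)


def simulate(points, t_end):
    """Maintain the bottom-to-top order from time 0 to t_end.

    Returns (final order as indices, list of processed events (t, a, b)).
    """
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    queue = []

    def schedule(r):
        # Certificate for the adjacent pair at ranks r and r + 1.
        if 0 <= r < len(order) - 1:
            a, b = order[r], order[r + 1]
            t = failure_time(points[a], points[b])
            if t is not None and t <= t_end:
                heapq.heappush(queue, (t, a, b))

    for r in range(len(order) - 1):
        schedule(r)
    events = []
    while queue:
        t, a, b = heapq.heappop(queue)
        r = order.index(a)
        if r + 1 >= len(order) or order[r + 1] != b:
            continue  # stale certificate: a and b are no longer adjacent
        order[r], order[r + 1] = b, a  # repair the structure
        events.append((t, a, b))
        schedule(r - 1)  # new neighbours need new certificates
        schedule(r + 1)
    return order, events
```

Each event touches only a constant number of certificates, which is the
locality property a kinetic data structure is designed to have.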

The kinetization process relies heavily on the fact that the local Delaunay
property for the edges in the DT ensures that the triangulation is globally
Delaunay. Let e be an edge connecting the disks B1, B2 and having the disks D1,
D2 as its neighbors in the triangulation. The local Delaunay property states that
the edge e is an edge of the DT if the exterior tangent ball of B1, B2 and D1
does not intersect the disk D2, or if the interior tangent ball of B1, B2 and D1
is not contained in D2 (see Fig. 1). The global Delaunay property states that
there exists an edge in the DT between two disks B1 and B2 if there exists an
exterior tangent ball to B1 and B2 that does not intersect any other disk or if
there exists an interior tangent ball to B1 and B2 that is not contained in any
other disk. In this work we prove the relationship between the global and local
Delaunay properties.

Fig. 2. The Voronoi diagram for a set of disks (left) and the corresponding
Augmented Delaunay Triangulation (right). The disks in dark gray are trivial.

Using the local Delaunay property, we can maintain the Voronoi diagram for 
the set of disks using two types of events, one of which appears only in the case 
of intersecting disks. The data structure that we use is called the Augmented 
Delaunay Triangulation (ADT) of the set of disks. It consists of the Delaunay 
triangulation of the disks augmented with some additional linear size data struc- 
ture associated with the disks that do not contribute to the Voronoi diagram 
(see Fig. 2). We call these disks trivial. Since trivial disks exist only when we
allow disk intersections, the ADT differs from the DT only when we have disk 
intersections. 

An interesting property of the ADT is that the closest pair of the set of disks 
is realized between two disks that share an edge in the ADT. Thus, knowing 
how to maintain the ADT enables us to maintain the closest pair of the set of 
disks using a tournament tree on the edges of the ADT. The distance between 
two disks B1 and B2 is defined as:

    d(B1, B2) = d(b1, b2) - r1 - r2,   if B1 ⊄ B2 and B2 ⊄ B1
    d(B1, B2) = -2 min{r1, r2},        if B1 ⊂ B2 or B2 ⊂ B1        (1)

where b_i are the centers of the disks, r_i their radii and d(·,·) denotes the
Euclidean metric. If the set of disks does not have any intersecting disks the
distance function (1) gives us the closest pair in the usual sense. If there are
intersecting disks, then the closest pair with respect to (1) is either the pair
of non-trivial disks with maximum penetration depth among all intersecting pairs
of disks, or the largest trivial disk and its container.
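
The distance function between disks translates directly into code. The sketch
below represents a disk as a hypothetical (x, y, r) tuple, treats containment
as non-strict, and finds the closest pair by brute force over all pairs; the
tournament tree over ADT edges is not modelled.

```python
import math
from itertools import combinations


def contains(outer, inner):
    """True if disk inner lies wholly inside disk outer."""
    (ox, oy, o_r), (ix, iy, i_r) = outer, inner
    return math.hypot(ox - ix, oy - iy) + i_r <= o_r


def disk_distance(b1, b2):
    """Centre distance minus radii, or -2*min radius for nested disks."""
    if contains(b1, b2) or contains(b2, b1):
        return -2 * min(b1[2], b2[2])
    return math.hypot(b1[0] - b2[0], b1[1] - b2[1]) - b1[2] - b2[2]


def closest_pair(disks):
    """Index pair (i, j), i < j, minimising disk_distance (brute force)."""
    return min(combinations(range(len(disks)), 2),
               key=lambda ij: disk_distance(disks[ij[0]], disks[ij[1]]))
```

For disjoint disks the value is the usual gap between boundaries; for
overlapping disks it is negative, so deeper penetration and larger contained
disks rank as "closer".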

Another important property of the ADT is that a subgraph of the ADT is a
spanning subgraph of the connectivity graph of the set of disks. Knowing how
to maintain the ADT enables us to maintain the connectivity of the set of disks
by maintaining the afore-mentioned spanning subgraph.
Finally, the DT of a set of non-intersecting disks has the property that if we
want to find the near neighbors of a disk we only need to look at its neighborhood
in the DT. Therefore, in order to maintain the near neighbors of a disk we simply
need to look at its neighborhood in the DT and update this neighborhood as
the DT changes. We make this statement more precise in Section 7.

The rest of the paper is structured as follows. In Section 2 we introduce the
Voronoi diagram for disks, and discuss some of its properties. Section 3 is devoted
to proving the relationship between the global and local Delaunay properties.
In Section 4 we show how to kinetize the Voronoi diagram. In Section 5 we
describe how to maintain the closest pair. In Section 6 we show how to maintain
a spanning subgraph of the connectivity graph of the set of disks. In Section 7
we discuss the maintenance of near neighbors of disks. Finally, Section 8 is
devoted to conclusions and further work.



2 The Voronoi Diagram for Disks and Its Properties 

Let S be a set of n disks B_j, with centers b_j and radii r_j. Let d(·,·) be the
Euclidean distance. We define the distance δ(p, B) between a point p ∈ ℝ^2 and
a disk B = {b, r}, as δ(p, B) = d(p, b) - r. We define the Voronoi diagram for
the set S as follows. For each i ≠ j, let H_ij = {y ∈ ℝ^2 : δ(y, B_i) ≤ δ(y, B_j)}.
Then we define the (closed) Voronoi cell V_i of B_i to be the cell
V_i = ∩_(j≠i) H_ij. The Voronoi diagram VD(S) of S is defined to be the set of
points which belong to more than one Voronoi cell. The Voronoi diagram just
defined is a subdivision of the plane. It consists of straight or hyperbolic arcs
and each Voronoi cell is star-shaped with respect to the center of the
corresponding disk. In contrast to the Voronoi diagram for points, there may be
disks whose corresponding Voronoi cell is empty. In particular, the Voronoi cell
V_i of a disk B_i is empty if and only if B_i is wholly contained in another disk
(see Property 2]). A disk whose Voronoi cell has empty interior is called
trivial; otherwise it is called non-trivial.
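
The definitions above can be exercised with a short sketch: δ(p, B) as a
function, the disk whose closed cell contains a given point, and a triviality
test based on the containment criterion just stated. Disks are hypothetical
(x, y, r) tuples, ties between disks are broken arbitrarily by index, and the
degenerate case of identical disks is not handled.

```python
import math


def delta(p, disk):
    """Signed distance from point p to the boundary of disk (negative inside)."""
    (px, py), (bx, by, r) = p, disk
    return math.hypot(px - bx, py - by) - r


def owner(p, disks):
    """Index of a disk whose closed Voronoi cell contains p."""
    return min(range(len(disks)), key=lambda i: delta(p, disks[i]))


def is_trivial(i, disks):
    """True if disk i is wholly contained in some other disk."""
    bx, by, r = disks[i]
    return any(j != i and math.hypot(bx - cx, by - cy) + r <= big_r
               for j, (cx, cy, big_r) in enumerate(disks))
```

Note that a point at the centre of a trivial disk is still owned by the
containing disk, since the container's signed distance is smaller there.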

We define the dual of $VD(S)$ as follows. The vertices are the centers of the
non-trivial disks. If $V_i \cap V_j \neq \emptyset$, we add an edge $[b_i, b_j]$
for every open arc $\alpha$ of $V_i \cap V_j$. It turns out that the dual graph is
a planar graph and the size of both the Voronoi diagram and its dual graph is
$O(n)$ (see Properties 6 and 7). If the Voronoi diagram consists of a single
connected component and the disks are in general position, the dual graph is a
generalized triangulation of the plane. By generalized we mean that the edges of
the triangles may be curved arcs or polygonal lines instead of straight line
segments. We assume throughout the rest of the paper that the Voronoi diagram
consists of only one connected component. We shall refer to the dual graph of
$VD(S)$ as the Delaunay Graph $DG(S)$ of $S$. If the disks are in general
position we refer to the dual graph as the Delaunay Triangulation $DT(S)$ of $S$.

Let $B_i$ and $B_j$ be two disks such that no disk is contained inside the other. A
ball tangent to $B_i$ and $B_j$ that does not contain either of the two is an exterior
tangent ball. A ball tangent to $B_i$ and $B_j$ that lies in $B_i \cap B_j$ is an interior
tangent ball. The following theorem couples the existence of edges in $DG(S)$
with exterior and interior tangent balls of disks in $S$.

Theorem 1 (Global Property). There exists an edge $[b_i, b_j]$ in $DG(S)$ between
$B_i$ and $B_j$ if and only if one of the following holds: (1) there exists an
exterior tangent ball to $B_i$ and $B_j$ which does not intersect any disk
$B_k \in S$, $k \neq i, j$; (2) there exists an interior tangent ball to $B_i$
and $B_j$, which is not contained in any disk $B_k \in S$, $k \neq i, j$.

Proof. (Sketch) Let $[b_i, b_j]$ be an edge of $DG(S)$. Then $V_i \cap V_j$ consists
of at least one arc $\alpha$ with non-empty interior. Let $y$ be an interior point of
$\alpha$. Consider the ball $C$ centered at $y$ with radius
$|\delta(y, B_i)| = |\delta(y, B_j)|$. Then $C$ is tangent to both $B_i$, $B_j$
and does not intersect any disk $B_k$, $k \neq i, j$, if $y \notin B_i \cap B_j$,
and is not contained in any disk $B_k$, $k \neq i, j$, if $y \in B_i \cap B_j$.
Conversely, let $C$ be a common tangent ball of $B_i$, $B_j$, and let $y$ be its
center. If either assumption (1) or assumption (2) of the theorem holds, we have
that $\delta(y, B_i) = \delta(y, B_j) < \delta(y, B_k)$, for all $k \neq i, j$.
Hence $y$ is an interior point of some arc $\alpha$ of $V_i \cap V_j$, and thus
there exists at least one edge $[b_i, b_j]$ in $DG(S)$. □

In order to account for the trivial disks, we augment the Delaunay triangulation
with some additional edges. For a trivial disk $D$ we add an edge between
$D$ and its container disk. If $D$ has more than one container we need to add an
edge to only one of its containers, chosen arbitrarily. We call this structure the
Augmented Delaunay Triangulation $ADT(S)$ of $S$. The set of additional edges
forms a forest, and the root of each tree in the forest is a non-trivial disk. Clearly,
the forest has linear size. Hence the size of $ADT(S)$ is still $O(n)$.
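
As a rough illustration of the additional edges, the sketch below picks one container for every trivial disk by brute force. It is an assumption-laden example rather than the paper's procedure: disks are `(center, radius)` pairs, no two disks are identical (so containment cannot form a cycle), and the names `contains` and `container_edges` are hypothetical.

```python
import math

def contains(outer, inner):
    """True if disk `inner` is wholly contained in disk `outer`."""
    (ox, oy), big = outer
    (ix, iy), small = inner
    return math.hypot(ox - ix, oy - iy) + small <= big

def container_edges(disks):
    """Map each trivial disk index to one (arbitrarily chosen) container index.

    Assumes no two disks coincide, so following container links always
    ends at a non-trivial disk and the edges form a forest.
    """
    edges = {}
    for i, d in enumerate(disks):
        for j, e in enumerate(disks):
            if i != j and contains(e, d):
                edges[i] = j  # first container found; any one would do
                break
    return edges
```

Each trivial disk contributes exactly one edge, which is why the forest, and hence the augmented structure, stays linear in size.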

3 The Local Property of the Delaunay Triangulation 

In this section we present the local Delaunay property for a set of possibly 
intersecting disks and we show that the local Delaunay property is a sufficient 
and necessary condition for a (generalized) triangulation of the set of disks to 
be globally Delaunay. We only consider non-trivial disks, since trivial disks do 
not contribute to the Voronoi diagram. 

Let $\pi_{ij}$ be the bisector of $B_i$ and $B_j$. The bisectors are lines or hyperbolas
which can be oriented. We define the orientation to be such that $b_i$ is to the
left of $\pi_{ij}$. Let $\prec$ be the linear ordering on the points of $\pi_{ij}$.
Let $o_{ij}$ be the midpoint of the subsegment of $b_i b_j$ that lies either in free
space or in $B_i \cap B_j$. We can parameterize $\pi_{ij}$ as follows: if
$p \prec o_{ij}$ then $\zeta_{ij}(p) = -(\delta(p, B_i) - \delta(o_{ij}, B_i))$;
otherwise $\zeta_{ij}(p) = \delta(p, B_i) - \delta(o_{ij}, B_i)$. The function
$\zeta_{ij}$ is a 1-1 and onto mapping from $\pi_{ij}$ to $\mathbb{R}$.

Let $B_i$, $B_j$ and $B_k$ be three disks such that no disk is contained inside another.
The three disks may have up to eight common tangent balls. Among those we
are interested in only two kinds: those balls that do not contain any of the three
disks, which we call exterior tangent balls, and those that are contained entirely in
the intersection of the three disks, which we call interior tangent balls. Let $P_i$,
$P_j$, $P_k$ be the points of tangency of the disks $B_i$, $B_j$, $B_k$ with their
common tangent ball. If $CCW(P_i, P_j, P_k) > 0$ we call the common tangent ball the
left tangent ball of the triple $B_i$, $B_j$, $B_k$. If $CCW(P_i, P_j, P_k) < 0$, we
call the common tangent ball the right tangent ball of the triple $B_i$, $B_j$,
$B_k$. Note that three disks have at most one left/right exterior/interior tangent
ball. Finally, we define $\zeta^{\mathrm{l}}_{ij}(B_k)$ to be the parameter value of
the center $c \in \pi_{ij}$ of the left tangent ball of $B_i$, $B_j$ and $B_k$.
Correspondingly, $\zeta^{\mathrm{r}}_{ij}(B_k)$ is the parameter value of the center
of the right tangent ball of $B_i$, $B_j$ and $B_k$.

Let $T(S)$ be a (generalized) triangulation of $S$ that is constructed as follows.
The vertices of $T(S)$ are the centers of the disks in $S$. An oriented edge
$e^{kl}_{ij}$ in $T(S)$ is an edge that connects the disks $B_i$ and $B_j$ and has
as neighbors the disks $B_k$ and $B_l$ to its left and right, respectively. It is
possible that the disks $B_k$ and $B_l$ are the same. The disk $B_k$ is called the
left neighbor of $e^{kl}_{ij}$ and the disk $B_l$ is called the right neighbor of
$e^{kl}_{ij}$. Note that the quadruple $(i, j, k, l)$ uniquely defines edges in
$T(S)$, i.e., there can be at most one oriented edge in the triangulation starting
from $B_i$, ending at $B_j$ and having $B_k$ and $B_l$ to its left and right,
respectively. The left tangent ball of the triple $B_i$, $B_j$, $B_k$ is called
the left (tangent) ball of $e^{kl}_{ij}$ and similarly, the right tangent ball of
the triple $B_i$, $B_j$, $B_l$ is called the right (tangent) ball of $e^{kl}_{ij}$.
We assume that for every edge in $T(S)$ its left and right tangent balls exist.
Then we can embed $e^{kl}_{ij}$ with a two-leg polygonal line $b_i x b_j$, where $x$
is a point on $\pi_{ij}$ with parameter value $\zeta_{ij}(x)$ in between
$\zeta^{\mathrm{l}}_{ij}(B_k)$ and $\zeta^{\mathrm{r}}_{ij}(B_l)$. For every triangle
$\Delta_{ijk} \in T(S)$ that connects the disks $B_i$, $B_j$ and $B_k$, in
counterclockwise order, we associate the left tangent ball $\hat{\Delta}_{ijk}$ of
the triple $B_i$, $B_j$, $B_k$. This is called the Delaunay ball of
$\Delta_{ijk}$. Note that there is a 1-1 correspondence between triangles $\Delta$
in $T(S)$ and their Delaunay balls $\hat{\Delta}$.

An edge $e^{kl}_{ij}$ in $T(S)$ is called locally Delaunay if the predicate
InCircle($B_i$, $B_j$, $B_k$, $B_l$) is false. A triangle $\Delta$ in $T(S)$ is
called locally Delaunay if all its edges are locally Delaunay. The InCircle
predicate is defined below.

Definition 1. Let $B_i$, $B_j$, $B_k$, $B_l$ be four disks. The predicate
InCircle($B_i$, $B_j$, $B_k$, $B_l$) is true if $k \neq l$ and either $B_l$
intersects the exterior left tangent ball of $B_i$, $B_j$ and $B_k$, or $B_l$
contains the interior left tangent ball of $B_i$, $B_j$ and $B_k$.
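
Both cases of the definition reduce to one comparison of weighted distances once the center of the left tangent ball is known. The sketch below assumes that this center `c` has already been computed (solving for the tangent ball is not shown), that the caller has checked $k \neq l$, and that tangency does not count as intersection or containment; the function name and the tolerance `eps` are illustrative assumptions.

```python
import math

def delta(p, disk):
    """Additively weighted distance d(p, b) - r from point p to a disk."""
    (bx, by), r = disk
    return math.hypot(p[0] - bx, p[1] - by) - r

def in_circle_given_center(c, b_i, b_l, eps=1e-12):
    """InCircle test for a tangent ball centered at c and tangent to b_i.

    rho is the signed radius of the ball: positive for an exterior ball,
    negative for an interior one.  An exterior ball is intersected by b_l,
    and an interior ball is contained in b_l, exactly when
    delta(c, b_l) < rho.
    """
    rho = delta(c, b_i)
    return delta(c, b_l) < rho - eps
```

For an exterior ball, $d(c, b_l) < \rho + r_l$ is the usual disk-intersection test; for an interior ball, $d(c, b_l) + |\rho| < r_l$ is the containment test, and both rearrange to the same inequality.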

Note that if an edge $e^{kl}_{ij}$ is locally Delaunay then
$\zeta^{\mathrm{l}}_{ij}(B_k) > \zeta^{\mathrm{r}}_{ij}(B_l)$. This
implies that if a triangle $\Delta$ is locally Delaunay, then the center of its
Delaunay ball $\hat{\Delta}$ lies in the interior of $\Delta$. We are now ready to
prove the main result of this section.

Theorem 2 (Local Property). A (generalized) triangulation $T(S)$ is the Delaunay
triangulation of $S$ if and only if all the triangles in $T(S)$ are locally
Delaunay.



Proof. It is straightforward to verify that if a triangulation is globally Delaunay
then it is locally Delaunay as well.
Suppose now that we have a triangulation $T(S)$ that is locally Delaunay
but not globally. We assume without loss of generality that for all triangles the
corresponding Delaunay balls are interior. If this is not the case we can increase
the radii of all the disks by a sufficiently large quantity. The triangulation $T(S)$
is not affected by this change, other than that all the Delaunay balls become
interior.

Since $T(S)$ is not globally Delaunay there exists a triangle $\Delta$ that is locally
Delaunay but its Delaunay ball $\hat{\Delta}$ is contained inside some disk
$B = \{b, r\}$. Consider the distance between the disk $B$ and the Delaunay ball
$\hat{\Delta}$. This distance is
$\delta(\hat{\Delta}, B) = d(b, c_\Delta) + r_\Delta - r$, where $c_\Delta$ and
$r_\Delta$ are the center and radius of $\hat{\Delta}$ (interior Delaunay balls are
considered to have negative radius). Among all triangles $\Delta$ for which
$\hat{\Delta} \subset B$, choose $\Delta$ to be the one for which
$\delta(\hat{\Delta}, B)$ is minimized.

Let $e = [b_1, b_2]$ be the (oriented) edge of $\Delta$ that the segment
$c_\Delta b$ intersects (see Fig. 3). Let $L$ be the two-leg polygonal line
$b_1 c_\Delta b_2$. Since $c_\Delta b$ intersects $e$, $b$ must lie in the
half-plane $H$ bounded by $L$ that contains $e$. Let $\Delta'$ be the left
neighboring triangle of $e$. Since both $\Delta$ and $\Delta'$ are locally Delaunay
the quad $Q = b_1 c_\Delta b_2 c_{\Delta'}$ is contained inside
$\Delta \cup \Delta'$, and clearly $b$ cannot lie inside $Q$. But then we have
$\hat{\Delta}' \subset B$, and moreover
$\delta(\hat{\Delta}', B) < \delta(\hat{\Delta}, B)$, which contradicts the
fact that $\delta(\hat{\Delta}, B)$ is minimal. □

4 Kinetizing the Delaunay Triangulation 

Fig. 3. Proof of the local property.

The framework that we use for maintaining the Voronoi diagram or equivalently
the Augmented Delaunay triangulation is the Kinetic Data Structure (KDS)
framework. The geometric attribute that we want to maintain as the objects
move is called the configuration function, e.g., the Voronoi diagram of the set of
disks. In the KDS setting we maintain a set of certificates that are conditions
which ensure the correctness of the configuration function, e.g., that an edge in
$DT(S)$ passes the InCircle test. As the objects move the certificates fail. These
are the critical events for the KDS and moreover the corresponding times are
the only times that the configuration function can possibly change. When a
critical event happens we need to change the set of certificates and possibly
the configuration function itself. In order to efficiently update the configuration
function we maintain an event queue of the certificates w.r.t. their failure times.
When a critical event takes place we remove some certificates from the queue
and add some new ones. For more details on KDSs see the references.
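
The event queue described above can be sketched as a priority queue keyed on failure time with lazy deletion. This is a generic illustration under stated assumptions, not the authors' data structure: certificates are arbitrary hashable objects, failure times are supplied by the caller, and the class and method names are hypothetical.

```python
import heapq
import itertools

class EventQueue:
    """Priority queue of certificates ordered by failure time."""

    def __init__(self):
        self._heap = []                    # entries: (time, token, certificate)
        self._live = {}                    # certificate -> token of its live entry
        self._counter = itertools.count()  # unique tokens, also used as tie-breakers

    def schedule(self, cert, failure_time):
        """Add a certificate, or reschedule it if it is already queued."""
        token = next(self._counter)
        self._live[cert] = token
        heapq.heappush(self._heap, (failure_time, token, cert))

    def deschedule(self, cert):
        """Mark a certificate as removed; its heap entry is skipped later."""
        self._live.pop(cert, None)

    def pop_next(self):
        """Return (time, certificate) of the next critical event, or None."""
        while self._heap:
            time, token, cert = heapq.heappop(self._heap)
            if self._live.get(cert) == token:
                del self._live[cert]
                return time, cert
        return None
```

Lazy deletion keeps both `schedule` and `deschedule` simple at the cost of leaving stale entries in the heap until they surface; a KDS that needs strict $O(\log n)$ worst-case descheduling would use a heap with explicit handles instead.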

Maintaining the Voronoi diagram or its dual, the Delaunay triangulation, for
a set of points moving on the plane is straightforward. This is due to the
local property of the Delaunay triangulation, which states that if the triangles
of the Delaunay triangulation are locally Delaunay, then the triangulation is the
Delaunay triangulation. When one of the conditions fails we simply have to do an
edge-flip operation to restore the correctness of the Delaunay triangulation. The
same principle is exploited to maintain the power diagram of non-intersecting
moving disks and the Voronoi diagram for rigidly moving polygons.

In the case of non-intersecting disks the very same ideas can be used. The 
local Delaunay property is also true for the Delaunay triangulation of disks, as 
we showed in the preceding section, and thus the critical events happen at times 
when four disks are cocircular or when three disks lying on the convex hull of 
S have a common tangent line. In fact if we add a disk at infinity and connect 
every disk lying on the convex hull of S with the disk at infinity, the compactified 
version of the Delaunay triangulation of S consists of triangles only, and every 
triangle has exactly three neighboring triangles. In this setting, the case of three 
disks having a common tangent reduces to a cocircularity event with one of the 
disks being a disk at infinity. When such a cocircularity event happens we only 
need to perform an edge-flip operation to restore the correctness of the Delaunay 
triangulation, and its dual the Voronoi diagram. 

However, when we allow disk intersections the situation changes considerably.
Unlike the case of points, there are disks that are not associated with a particular
Voronoi cell, namely the trivial disks. We need to account for these disks, since as
the disks move some of the trivial disks may become non-trivial and vice versa.
This is done by considering the Augmented Delaunay triangulation instead of the
Delaunay triangulation. There are two types of events that change the
combinatorial structure of $ADT(S)$: the cocircularity and the
appearance/disappearance event. Both events are associated with edges of the
$ADT(S)$. In particular, an edge in $DT(S)$ is associated with a cocircularity and
a disappearance event. An edge in $ADT(S) \setminus DT(S)$ is associated with an
appearance event. We now discuss each type of event separately.

The cocircularity event happens when four distinct disks have a common
exterior or interior tangent ball. Let $B_i$, $i = 1, 2, 3, 4$, be the four disks
associated with the cocircularity event and let $[b_1, b_3]$ be the edge to be
flipped. We need to delete that edge and add the edge $[b_2, b_4]$ (see Fig. 4 (left)).
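
The combinatorial part of the flip can be sketched as follows. This is a simplified illustration, assuming triangles are stored as `frozenset`s of three distinct vertex ids and that the flipped edge is shared by exactly two triangles with distinct opposite vertices; the generalized triangulation of the paper may contain duplicated edges, which this representation does not model. The name `flip_edge` is hypothetical.

```python
def flip_edge(triangles, a, c):
    """Replace edge (a, c) by the edge joining the two opposite vertices.

    `triangles` is a set of frozensets of three vertex ids; a new set is
    returned and the input is left unchanged.
    """
    incident = [t for t in triangles if a in t and c in t]
    if len(incident) != 2:
        raise ValueError("edge must be shared by exactly two triangles")
    (b,) = incident[0] - {a, c}   # vertex opposite the edge in the first triangle
    (d,) = incident[1] - {a, c}   # vertex opposite the edge in the second triangle
    result = set(triangles) - set(incident)
    result.add(frozenset({a, b, d}))
    result.add(frozenset({c, b, d}))
    return result
```

With vertices 1 to 4 as in the text, flipping the edge between 1 and 3 replaces triangles {1, 2, 3} and {1, 3, 4} by {1, 2, 4} and {2, 3, 4}.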
An appearance event occurs when a disk $B_1$ contained inside a disk $B_2$
is no longer wholly contained inside $B_2$. There are two possibilities when this
happens: (1) $B_2$ is a trivial disk and (2) $B_2$ is a non-trivial disk. If $B_2$
is a trivial disk we delete the edge $[b_1, b_2]$ and add the edge $[b_1, b_3]$,
where $B_3$ is the container disk of $B_2$. If $B_2$ is a non-trivial disk we need
to first check its neighbors in the DT to see if $B_1$ is contained in any one of
them. If such a neighbor $B_3$ exists we delete the edge $[b_1, b_2]$ and add the
edge $[b_1, b_3]$. If $B_1$ is not contained in any of the neighbors of $B_2$, we
need to identify the edge $[b_2, b_3]$ that corresponds to the edge of the Voronoi
cell of $B_2$ that the half-line $b_2 b_1$ intersects. Then we duplicate this edge
and add the edge $[b_1, b_3]$, thus creating two new triangles in $DT(S)$
(see Fig. 4 (right), from right to left).

A disappearance event takes place between two disks $B_1$ and $B_2$ when,
e.g., $B_1$ becomes wholly contained in $B_2$. The edge $[b_1, b_2]$ belongs to two
triangles with a common third point $b_3$ corresponding to a disk $B_3$. When the
disappearance event happens we need to delete the edge $[b_1, b_3]$, and identify
the two edges $[b_2, b_3]$, thus deleting two triangles from $DT(S)$ (see Fig. 4
(right), from left to right).

In any case, when an edge disappears we deschedule all the events associated 
with that edge. When an edge appears we schedule all the events corresponding 
to that edge and reschedule all the cocircularity events, if any, in which the new 
edge participates. 

Fig. 4. The cocircularity (left) and appearance/disappearance events (right). Top row:
the Voronoi diagram. Bottom row: the Augmented Delaunay triangulation.

The number of certificates that we maintain is $O(n)$, since we have a constant
number of certificates per edge in $ADT(S)$, and the number of edges in $ADT(S)$ is
$O(n)$. The time to process the cocircularity and disappearance events is
$O(\log n)$. The time to process the appearance event is $O(n)$. If the disks move
along pseudo-algebraic trajectories, the total number of cocircularity events that
our KDS has to process is $O(n^3 \beta(n))$, where $\beta(n) = \lambda_s(n)/n$ and
$\lambda_s(n)$ is the maximum length of an $(n, s)$ Davenport-Schinzel sequence for
some constant $s$. The total number of appearance/disappearance events that our KDS
processes is obviously $O(n^2)$. Hence the total number of events that have to be
processed is $O(n^3 \beta(n))$. A lower bound of $\Omega(n^2)$ on the number of
events can also be shown.

5 Closest Pair Maintenance 

In this section we discuss how to maintain the closest pair of a set $S$ of disks.
The distance function that we use is the function $\delta(\cdot,\cdot)$ introduced
earlier. The trivial way to do the maintenance is to consider all $\binom{n}{2}$
pairs of disks and maintain the one of minimum distance with respect to the
distance function $\delta(\cdot,\cdot)$. However, the Augmented Delaunay
triangulation of $S$ has the following property, the proof of which is omitted
from this version of the paper.

Theorem 3. Let $B_1$, $B_2$ be the closest pair in $S$. Then there exists an edge
$[b_1, b_2]$ in $ADT(S)$.

The theorem above suggests that we only need to look at $O(n)$ edges in order
to determine the closest pair, namely the edges of $ADT(S)$. In particular, we
need to maintain a tournament tree $T$ on the edges of $ADT(S)$. Before describing
how to actually maintain $T$ we need some definitions. Let $t_1$ and $t_2$ be two
nodes of $T$. We say that $t_1 \prec t_2$ if the depth of $t_1$ is smaller than the
depth of $t_2$ in $T$, or if $t_1$ and $t_2$ are of the same depth and $t_1$ is to
the left of $t_2$ in $T$. A node $t_1$ is adjacent to a node $t_2$ in $T$ if they
have the same parent. A node $t$ is a loser if its parent is its adjacent node.
Finally, a node $t$ is a winner if its parent is itself.

The certificates associated with $T$ are the winner-loser relationships. The tree
$T$ changes due to changes in the winner-loser relationships or due to changes
of the $ADT(S)$, because of cocircularity and appearance/disappearance events.
When a winner-loser relationship changes we simply propagate the new winner
up the tree, deschedule the old winner-loser relationships and schedule the new
ones. During this propagation we visit $O(\log n)$ nodes of the tree and
schedule/deschedule $O(\log n)$ certificates in total. Hence the cost per
winner-loser relationship change is $O(\log^2 n)$. When an edge disappears we
replace the corresponding leaf node with the last loser leaf node and delete the
last winner and loser leaf nodes. Then we propagate the last loser leaf node up
the tree as in the case of a winner-loser relationship change. Again this takes
$O(\log^2 n)$ time. Finally, when an edge appears we create two new leaf nodes in
the tree: one for the new edge and one for the first leaf node. We attach the two
new nodes under the current first leaf node and propagate the winner between these
two nodes up the tree. Once again this takes $O(\log^2 n)$ time.
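
The propagation step can be illustrated with a static tournament tree over a fixed set of keys (for instance, current edge lengths). This sketch is a simplification and an assumption on our part: it stores the tree in an array, supports only key replacement, and omits both the kinetic certificates and the leaf insertion/deletion described above. The class name `TournamentTree` is hypothetical.

```python
class TournamentTree:
    """Array-based tournament tree that tracks the index of the minimum key."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one key is required")
        self.n = len(keys)
        self.keys = list(keys)
        # Leaves occupy positions n..2n-1 and hold key indices;
        # internal nodes 1..n-1 hold the index of the winner below them.
        self.tree = [None] * self.n + list(range(self.n))
        for v in range(self.n - 1, 0, -1):
            self.tree[v] = self._play(self.tree[2 * v], self.tree[2 * v + 1])

    def _play(self, a, b):
        """Winner of a match: the index with the smaller key."""
        return a if self.keys[a] <= self.keys[b] else b

    def winner(self):
        """Index of the overall minimum key."""
        return self.tree[1]

    def update(self, i, key):
        """Change key i and replay the matches on the path to the root."""
        self.keys[i] = key
        v = (i + self.n) // 2
        while v >= 1:
            self.tree[v] = self._play(self.tree[2 * v], self.tree[2 * v + 1])
            v //= 2
```

An update replays one match per level, which corresponds to the $O(\log n)$ nodes visited during propagation; in the kinetic setting each replayed match would additionally reschedule its certificate in the event queue.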

The number of changes in a single winner-loser relationship is constant for
disks moving along pseudo-algebraic trajectories of constant degree. Hence the
number of events that we have to process is dominated by the number of
combinatorial changes of $ADT(S)$, which is $O(n^3 \beta(n))$. All but the
appearance event are processed in $O(\log^2 n)$ time; the appearance event is
processed in $O(n)$ time.

6 Kinetic Connectivity of Disks

In this section we discuss how to kinetically maintain the connectivity for a set
of disks of different radii. The problem for unit disks has already been studied
in earlier work.

The connectivity graph $K$ of a set of disks $S$ is defined as follows. The
vertices of $K$ are the centers of the disks in $S$. Two disks share an edge in $K$
if they intersect. Let $G$ be the subgraph of $ADT(S)$ defined as follows. An edge
$e$ in $ADT(S)$ belongs to $G$ if and only if it is an edge between two non-trivial
intersecting disks or between a trivial disk and its container disk. Clearly, $G$ is
a subgraph of the connectivity graph $K$ of $S$ (modulo multiple edges between
two disks in $DT(S)$). The important property of $G$ is captured by the following
theorem, the proof of which is omitted from this version of the paper.

Theorem 4. If $B_1, B_2 \in S$ belong to the same connected component in $K$, then
there exists a path in $G$ that connects $B_1$ and $B_2$.

In other words $G$ is a spanning subgraph of $K$. This is really important since
the size of $K$ is $\Theta(n^2)$ in the worst case, whereas the size of $G$ is
$O(n)$. We can now maintain the connectivity of the disks using the dynamic graph
data structure of Holm, de Lichtenberg and Thorup. This data structure supports
edge insertions and deletions in $O(\log^2 n)$ amortized time, and connectivity
queries in $O(\log n / \log\log n)$ time. The graph that we maintain is the graph
$G$ defined above. Once we have $ADT(S)$, maintaining $G$ is really simple. First
we color the edges of $ADT(S)$ as follows: edges between non-intersecting
non-trivial disks are green, edges between intersecting non-trivial disks are
orange and edges between trivial disks and their containers are red. Clearly, $G$
is the union of orange and red edges. The color of an edge changes if the
corresponding disks become tangent. In particular, whenever a green edge becomes an
orange edge or whenever an orange or a red edge appears we simply add it to $G$.
Whenever an orange edge becomes a green edge or whenever an orange or a red edge
disappears we simply delete it from $G$. Since the cost per insertion/deletion of
an edge in $G$ is $O(\log^2 n)$, in the amortized sense, the cost per update of $G$
is $O(\log^2 n)$ (amortized), except when we have an appearance event, in which
case the update cost is $O(n)$. The number of times that two disks, moving along
pseudo-algebraic trajectories of constant degree, can become tangent is constant.
This implies that the number of events due to disk tangencies is $O(n^2)$. The
total number of events for maintaining $G$ is thus dominated by $O(n^3 \beta(n))$,
which is the number of times that the Delaunay triangulation of the set of disks
can change combinatorially.
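
The coloring rule can be sketched for a static snapshot as follows. This is an illustrative example only: it takes the edges of $ADT(S)$ as given input, represents disks as `(center, radius)` pairs, and uses a plain union-find structure as a stand-in for the fully dynamic connectivity structure of Holm, de Lichtenberg and Thorup, which is not implemented here. The function names are hypothetical.

```python
import math

def edge_color(d1, d2):
    """Color of an ADT edge between two disks given as (center, radius)."""
    (x1, y1), r1 = d1
    (x2, y2), r2 = d2
    dist = math.hypot(x1 - x2, y1 - y2)
    if dist + min(r1, r2) <= max(r1, r2):
        return "red"      # one disk contains the other (trivial disk and container)
    if dist <= r1 + r2:
        return "orange"   # intersecting disks, neither contains the other
    return "green"        # disjoint disks

def component_labels(disks, adt_edges):
    """Label each disk with a representative of its component in G (orange + red)."""
    parent = list(range(len(disks)))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, j in adt_edges:
        if edge_color(disks[i], disks[j]) != "green":
            parent[find(i)] = find(j)
    return [find(v) for v in range(len(disks))]
```

Union-find handles only insertions, so it suits a one-off query on a snapshot; the kinetic setting also deletes edges, which is why a fully dynamic structure is needed there.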

7 Near Neighbor Maintenance 

Suppose that we have a set $S$ of non-intersecting moving disks. Let $P$ be a disk
in $S$ for which we want to know the disks in $S$ that are within a certain, possibly
time varying, distance $R_P$ from $P$. Let $C_P$ be the disk centered at $P$ with
radius $R_P$. The obvious approach is to maintain the distance from $P$ to every other



Voronoi Diagrams for Moving Disks and Applications 73 

disk in $S$ and keep those that intersect $C_P$. In fact, we can do better than that.
If we are maintaining the DT of $S$, the only disks that can enter or exit $C_P$ are
those that are end points of edges of the DT crossing $C_P$ exactly once. This is
the essence of the following theorem, the proof of which we omit from this paper.

Theorem 5. Let $T(S)$ be the $DT(S)$ and let $P \in S$. If a disk $Q \in S$
enters/exits the disk $C_P$ at some time $t_0$, then there exists an edge in $T(S)$
between $Q$ and some disk that intersects $C_P$.

Maintaining the near neighbors of $P$ then reduces to maintaining the DT of
$S$ and updating the set $E_P$ of DT edges, one end disk of which intersects $C_P$
and the other does not. The set $E_P$ changes when disks enter or exit $C_P$. Edge
flips due to the maintenance of $DT(S)$ may also change $E_P$. In case we want
to maintain the $k$-nearest neighbors of $P$ the same idea applies with two slight
modifications: (1) the distance $R_P$ is defined to be the distance of the center
of $P$ from $P_k$, where $P_k$ is the $k$-th nearest neighbor of $P$ and (2) the
edges of $DT(S)$ adjacent to $P_k$ are all included in $E_P$.
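
For a fixed instant, the set $E_P$ can be computed from the DT edges by a direct scan. The sketch below assumes the DT edges are given as index pairs, disks are `(center, radius)` pairs, $C_P$ is centered at the center of $P$, and a disk touching $C_P$ counts as intersecting it; the name `boundary_edges` is hypothetical.

```python
import math

def boundary_edges(disks, dt_edges, p, radius):
    """DT edges with exactly one end disk intersecting C_P.

    C_P is the disk of the given radius centered at the center of disks[p].
    """
    (px, py), _ = disks[p]

    def hits(i):
        (x, y), r = disks[i]
        return math.hypot(x - px, y - py) - r <= radius

    return [(i, j) for i, j in dt_edges if hits(i) != hits(j)]
```

Only the far endpoints of these edges need a certificate for entering $C_P$, and only the near endpoints need one for leaving it.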

We omit the details of the algorithms since they are essentially the same as
the corresponding algorithms for points.

8 Conclusion 

In this paper we presented how to kinetically maintain the Voronoi diagram for
a set of disks moving in the plane. The key steps in the kinetization process were
the introduction of the Augmented Delaunay triangulation and the establishment
of the relationship between the local and global Delaunay properties. We
showed how to maintain the closest pair of the set of disks and how to maintain
a spanning subgraph of the connectivity graph of the set of disks using the
Augmented Delaunay triangulation as the underlying structure. Finally, if the
disks do not intersect, we discussed how to maintain the disks that are within
a prescribed distance from a reference disk or how to maintain the $k$-nearest
neighbors of a reference disk.

We strongly believe that the results presented in this paper can be generalized
to more general additively weighted Voronoi diagrams, in which the weights can
be positive as well as non-positive. We would also like to extend the results
presented here to general smooth convex objects or to environments where obstacles
are present. Finally, the best known lower bound on the number of combinatorial
changes of the DT is $\Omega(n^2)$, whereas our upper bound is $O(n^3 \beta(n))$.
Given this upper bound, the algorithms presented here for maintaining the DT, the
closest pair and disk connectivity are not efficient; it would be of interest to
find kinetic data structures that solve these problems efficiently, or prove a
tighter lower or upper bound on the number of combinatorial changes of the DT.

Abstract. Our goal in this paper is the development of fast algorithms
for recognizing general classes of graphs. We seek algorithms whose
complexity can be expressed as a linear function of the graph size plus an
exponential function of $k$, a natural parameter describing the class. Our
classes are of the form $W_k(\mathcal{G})$, graphs that can be formed by augmenting
graphs in $\mathcal{G}$ by adding at most $k$ vertices (and incident edges). If
$\mathcal{G}$ is the class of edgeless graphs, $W_k(\mathcal{G})$ is the class of
graphs with a vertex cover of size at most $k$.

We describe a recognition algorithm for $W_k(\mathcal{G})$ running in time
$O((g + k)|V(G)| + (fk)^k)$, where $g$ and $f$ are modest constants depending on
the class $\mathcal{G}$, when $\mathcal{G}$ is a minor-closed class such that each
graph in $\mathcal{G}$ has bounded maximum degree, and all obstructions of
$\mathcal{G}$ (minor-minimal graphs outside $\mathcal{G}$) are connected. If
$\mathcal{G}$ is the class of graphs with maximum degree bounded by $D$ (not closed
under minors), we can still recognize graphs in $W_k(\mathcal{G})$ in time
$O(|V(G)|(D + k) + k(D + k)^{k+3})$.

Our results are obtained by considering minor-closed classes $\mathcal{G}$ for which
all obstructions are connected graphs, and showing that the size of any
obstruction for $W_k(\mathcal{G})$ is $O(tk^7 + t^7k^2)$, where $t$ is a bound on
the size of obstructions for $\mathcal{G}$.



1 Introduction 

One of the principal goals of algorithmic graph theory is to determine elegant
and efficient ways to characterize classes of graphs. A particularly enticing
approach applies readily to graph classes closed under minor containment (defined
formally in Section 2). In their seminal work on graph minors, Robertson and
Seymour proved that for any minor-closed graph class $\mathcal{G}$, there is a finite
number of minor-minimal graphs (the set of obstructions, or obstruction set) in
the set of graphs outside of $\mathcal{G}$. As a consequence, $G$ is in $\mathcal{G}$
if and only if no graph in the obstruction set of $\mathcal{G}$ is a minor of $G$;
if the obstruction set of $\mathcal{G}$ is known, there exists a polynomial-time
recognition algorithm for $\mathcal{G}$.

Unfortunately, finding the obstruction set of a class $\mathcal{G}$ is unsolvable in
general and appears to be a hard structural problem even for simple graph classes,
largely due to the rapid explosion in the size of the obstructions. Following a
brute force approach, it is possible to build a computer program enumerating graphs
and searching among them for obstructions. A crucial drawback of this method is
that there is no general way to bound the search space; more sophisticated methods
are also possible. It is thus necessary to determine upper bounds on the sizes
of the obstructions for special graph classes. Results of this type have been
obtained for graphs with bounded treewidth or pathwidth as well as more
general graphs.

In this paper we settle the question of the combinatorial growth for graph
classes created from simpler ones. We augment a graph class by adding at most
$k$ vertices (and adjacent edges) to each graph in the class. More formally, for
each graph class $\mathcal{G}$ and integer $k \ge 1$, we define $W_k(\mathcal{G})$ to
consist of graphs $G$ which are within $k$ vertices of $\mathcal{G}$, namely all
graphs $G$ such that the vertices of $G$ can be partitioned into sets $S_1$ and
$S_2$, where $|S_1| \le k$ and the subgraph of $G$ induced on the vertices in $S_2$
is in the class $\mathcal{G}$. The notion of "within $k$" can be used to easily
define the classes of graphs with vertex cover of size at most $k$, or graphs with
vertex feedback set of size at most $k$: $\mathcal{G}$ is the class of edgeless
graphs (i.e. $\mathrm{ob}(\mathcal{G}) = \{K_2\}$), or the class of forests (i.e.
$\mathrm{ob}(\mathcal{G}) = \{K_3\}$), respectively. Moreover, if $\mathcal{G}$ is
closed under taking of minors, then so is $W_k(\mathcal{G})$. We conjecture that the
sizes of obstructions for $\mathcal{G}$ and $W_k(\mathcal{G})$ are related.
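
For the edgeless class, membership in $W_k(\mathcal{G})$ is exactly the vertex cover question, which admits a simple bounded search tree. The sketch below is a textbook illustration and is not the algorithm developed in this paper; the graph is assumed to be given as a list of edges and the name `has_vertex_cover` is hypothetical.

```python
def has_vertex_cover(edges, k):
    """True if deleting at most k vertices leaves an edgeless graph.

    Branches on the two endpoints of an arbitrary remaining edge: any
    vertex cover must contain at least one of them.
    """
    if not edges:
        return True
    if k == 0:
        return False
    u, v = edges[0]
    for w in (u, v):
        remaining = [e for e in edges if w not in e]
        if has_vertex_cover(remaining, k - 1):
            return True
    return False
```

The recursion has depth at most $k$ and branching factor two, so the search explores at most $2^k$ branches regardless of the size of the graph.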



Conjecture 1. If $\mathcal{G}$ is a minor-closed graph class whose obstructions have
no more than $t$ vertices, then the size of the obstructions for $W_k(\mathcal{G})$
depends only on $k$ and $t$.



We show that the above conjecture is true for any class $\mathcal{G}$ in which no
graph has degree greater than a fixed constant $D$, provided the class is
minor-closed with all obstructions connected (we show later that this is
equivalent to the class being closed under disjoint union). In particular, we prove
that for such a $\mathcal{G}$, any graph in the obstruction set of
$W_k(\mathcal{G})$ has size bounded by a polynomial in $k$, $D$, and (for $D > 5$)
$C$, where $C$ is an upper bound on the length of paths of degree-two vertices
("chains") in the obstruction set of $\mathcal{G}$. As intermediate steps in
proving this result, we develop lemmas demonstrating the existence of leafy trees
in sparse (bounded degree) graphs, constituting results of independent interest.

Making use of advances in construction of fixed-parameter tractable algorithms,
we employ the method of "reduction to a problem kernel"
to obtain recognition algorithms such that an exponential function on $k$
contributes only an additive term to the overall complexity. This builds on previous
work: our results can be seen as generalizations of the algorithms
designed for the vertex-cover problem. Fast algorithms of
this type have been generated for other problems.
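
A well-known instance of reduction to a problem kernel is the high-degree rule for vertex cover, often attributed to Buss. The sketch below illustrates that classical rule only; it is not the kernelization developed in this paper, and the function name `buss_kernel` and the edge-list input format are assumptions.

```python
def buss_kernel(edges, k):
    """Kernelize a vertex cover instance (edges, k).

    Returns (kernel_edges, remaining_k), or None when no vertex cover of
    size at most k can exist.
    """
    edges = {frozenset(e) for e in edges}
    while True:
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        high = [v for v, d in degree.items() if d > k]
        if not high:
            break
        if k == 0:
            return None
        # A vertex of degree > k must belong to every cover of size <= k.
        v = high[0]
        edges = {e for e in edges if v not in e}
        k -= 1
    # Every remaining vertex covers at most k edges, so k vertices cover <= k*k.
    if len(edges) > k * k:
        return None
    return edges, k
```

After the reduction the instance has at most $k^2$ edges, so any exponential search runs on a graph whose size depends on $k$ alone, while the reduction itself is polynomial in the input size.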

We make use of our upper bound on the size of obstructions to obtain an
$O((g(t) + k)|V(G)| + (f(k, t))^k)$-time algorithm recognizing $W_k(\mathcal{G})$
when $\mathcal{G}$ satisfies the following three conditions: $\mathcal{G}$ is closed
under taking of minors; each graph in $\mathcal{G}$ has degree bounded by $D$; and
the obstructions for $\mathcal{G}$ are connected and of size bounded by $t$. Here
$g$ and $f$ are polynomial functions with multiplicative constants depending on the
class $\mathcal{G}$. Finally, we demonstrate that similar time complexities can be
achieved even when the closure restriction is removed. We present an
$O(|V(G)|(D + k) + k(D + k)^{k+3})$-time algorithm for the recognition
of $W_k(\mathcal{G})$, where $\mathcal{G}$ is the class of graphs with maximum
degree bounded by $D$.

After establishing notation in Section 2, we demonstrate in Section 3 that
we can obtain a bound on the size of a graph $G$ when the disjoint union of
stars is excluded as a minor and there are bounds on the degree and length of
chains. Building on this result, we then establish a bound on the size
of obstructions for $W_k(\mathcal{G})$. The subsequent sections, making use of
these results, establish polynomial-time fixed-parameter tractable algorithms. In
this conference version, many proofs are omitted or only sketched.



2 Preliminaries 

We make use of standard graph-theoretic notation. A graph $G$ has vertex set
$V(G)$ and edge set $E(G)$. We define $\mathrm{nbr}_G(S)$ to be
$\{v \in V(G) \mid \exists w \in S, (v, w) \in E(G)\}$, and $G[S]$ to be the
subgraph of $G$ induced on $S$. For $v \in V(G)$, we define the degree of $v$ in
$G$, or $\deg_G(v)$, to be $|\mathrm{nbr}_G(\{v\})|$, the set of pendant
vertices $\mathrm{pend}(G)$ to be $\{v \in V(G) \mid \deg_G(v) = 1\}$, and the set
of internal vertices $\mathrm{int}(G)$ to be $V(G) \setminus \mathrm{pend}(G)$. The
degree bound of the graph $G$, denoted $\Delta(G)$, is
$\max_{v \in V(G)} \deg_G(v)$. For $\mathcal{G}$ a finite graph class, we define
$\text{max-size}(\mathcal{G}) = \max\{|V(G)| \mid G \in \mathcal{G}\}$; if
$\Delta(\mathcal{G}) = \max\{\Delta(G) \mid G \in \mathcal{G}\}$ can be bounded
above by a constant, $\mathcal{G}$ is bounded degree.
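
These definitions can be mirrored almost literally in code. The sketch below assumes a simple undirected graph stored as a dictionary mapping each vertex to the set of its neighbors; this representation and the function names are illustrative choices, not part of the paper.

```python
def nbr(adj, S):
    """Vertices adjacent to at least one vertex of S."""
    S = set(S)
    return {v for v in adj if adj[v] & S}

def deg(adj, v):
    """Degree of v, i.e. the size of nbr(adj, {v})."""
    return len(nbr(adj, {v}))

def pend(adj):
    """Pendant vertices: those of degree exactly one."""
    return {v for v in adj if deg(adj, v) == 1}

def internal(adj):
    """Internal vertices: all vertices that are not pendant."""
    return set(adj) - pend(adj)

def degree_bound(adj):
    """Maximum degree of the graph (0 for the empty graph)."""
    return max((deg(adj, v) for v in adj), default=0)
```

Note that `nbr` may return vertices of `S` itself when two vertices of `S` are adjacent, in line with the set-builder definition above.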

At times we will alter a graph by replacing paths by edges. We call a path
of a graph $G$ an $a$-chain if it has length (number of edges) $a$, all its internal
vertices have degree two in $G$, and its end vertices are either adjacent or have
degree not equal to two. We denote as $\mathrm{chain}(G)$ the largest $a$ for which
there exists an $a$-chain in $G$, setting $\mathrm{chain}(G) = 0$ when
$E(G) = \emptyset$. A graph $G$ is resolved if $\mathrm{chain}(G) = 1$. Finally, for
any graph class $\mathcal{G}$, we define $\mathrm{chain}(\mathcal{G})$ to be
$\max\{\mathrm{chain}(G) \mid G \in \mathcal{G}\}$.

Throughout this paper we will use $\mathcal{G}$ to denote a graph class that is closed
under taking of minors or, equivalently, a minor-closed graph class. A graph G
is a minor of a graph H if a graph isomorphic to G can be formed from H by a
series of edge and vertex deletions and edge contractions. The contraction of an
edge $e = (u, v)$ in G results in a graph $G'$, in which u and v are replaced by a new
vertex $v_e$ ($V(G') = \{v_e\} \cup V(G) \setminus \{u, v\}$) and in which for every neighbor w of u
or v in G, there is an edge $(w, v_e)$ in $G'$. An obstruction of $\mathcal{G}$ is a minor-minimal
graph outside $\mathcal{G}$; we use $\mathrm{ob}(\mathcal{G})$ to denote the set of obstructions of $\mathcal{G}$, or the
obstruction set of $\mathcal{G}$. We use $G \le H$ and $G \subseteq H$ to denote that G is a minor of
or an induced subgraph of H, respectively.
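
Edge contraction, as just defined, can be sketched in a few lines. This is only an illustration under the assumption that a graph is a dictionary from vertices to neighbour sets; the function name `contract` and the caller-supplied name for the new vertex are hypothetical.

```python
def contract(adj, u, v, new):
    """Contract the edge (u, v): replace u and v by the single vertex `new`,
    which becomes adjacent to every former neighbour of u or v."""
    assert v in adj[u], "(u, v) must be an edge"
    merged = (adj[u] | adj[v]) - {u, v}
    # Copy the rest of the graph, dropping u and v everywhere.
    out = {x: set(ns) - {u, v} for x, ns in adj.items() if x not in (u, v)}
    out[new] = set(merged)
    for w in merged:
        out[w].add(new)
    return out

# Contracting the middle edge of the path 0 - 1 - 2 - 3 leaves a path on three vertices.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
shorter = contract(path, 1, 2, "e")
```

Repeating such contractions, together with edge and vertex deletions, yields exactly the minors of the original graph.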

We use $K_r$ to denote the complete graph on r vertices and $K_{r,s}$ to denote the
complete bipartite graph with r and s vertices in the two parts of the bipartition.
The graph $K_{1,r}$ is also known as a star. We use $K_{1,r}^{k+1}$ to denote the graph
consisting of $k + 1$ disjoint copies of $K_{1,r}$.

3 Excluding Disjoint Stars 

We first establish Theorem 1, which shows that the absence of a $K_{1,r}^{k+1}$ minor
combined with a bound on the degree and on the length of a chain results in
a graph of bounded size. The statement of the theorem looks complicated; the
important fact is that the bound obtained is polynomial in k, r, the degree
bound, and the chain length. In Section 4 we will make use of the theorem by
showing that any sufficiently large graph with the aforementioned restrictions
must contain the disjoint union of stars as a minor.

Theorem 1. Any $K_{1,r}^{k+1}$-minor-free connected graph G where $\Delta(G) \le D$ for
$D \ge 3$ and $\mathrm{chain}(G) \le C$ has at most $f(k, r, D, C)$ vertices, where $\gamma = \max\{r - 1, D\}$ and

$$
f(k,r,D,C) =
\begin{cases}
(k+1)(D+1), & \text{if } r = 1,\\
(D+1)(k(D^2+1)+1), & \text{if } r = 2,\\
\tfrac{5}{2}(D+1)(k(D^2+1)+1)(CD+2), & \text{if } r = 3,\\
5(CD+2)\bigl(\gamma(1+\gamma)(k(4(\gamma-1)^2+1)+1) - 1\bigr), & \text{if } r \ge 4.
\end{cases}
$$
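
To get a feel for how the bound grows, it can be evaluated numerically. The sketch below assumes the four cases read as $(k+1)(D+1)$, $(D+1)(k(D^2+1)+1)$, $\tfrac{5}{2}(D+1)(k(D^2+1)+1)(CD+2)$ and $5(CD+2)(\gamma(1+\gamma)(k(4(\gamma-1)^2+1)+1)-1)$ with $\gamma = \max\{r-1, D\}$; the leading constants are a reading of a damaged formula, so treat them as an assumption. The function name `f` mirrors the theorem, and exact rational arithmetic is used because of the factor 5/2.

```python
from fractions import Fraction

def f(k, r, D, C):
    """Size bound of Theorem 1 (constants as assumed in the lead-in)."""
    if r == 1:
        return Fraction((k + 1) * (D + 1))
    if r == 2:
        return Fraction((D + 1) * (k * (D**2 + 1) + 1))
    if r == 3:
        return Fraction(5, 2) * (D + 1) * (k * (D**2 + 1) + 1) * (C * D + 2)
    g = max(r - 1, D)  # gamma in the theorem
    return Fraction(5 * (C * D + 2) * (g * (1 + g) * (k * (4 * (g - 1) ** 2 + 1) + 1) - 1))

# With r = 1 and D = k + 1 the bound is (k + 1)(k + 2).
small = f(2, 1, 3, 1)
```

The bound is polynomial in all four arguments, which is the only property the later sections rely on.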



Proof. For each value of r, we prove the theorem by contradiction, assuming
that $|V(G)| > f(k,r,D,C)$ and showing that as a consequence $K_{1,r}^{k+1} \le G$. The
cases r = 1 and r = 2 follow by simple reasoning about greedy algorithms to
find $K_{1,r}^{k+1}$. For the case in which r = 3, we find a minor of G which is a resolved
tree with sufficiently many vertices, and then show that this tree contains $K_{1,r}^{k+1}$,
by using the following Ramsey-theoretic lemmas.



Lemma 1. If T is a tree, R is the set of vertices of degree at least r in T, and
$|R| > k((\Delta(T))^2 + 1)$, then $K_{1,r}^{k+1} \le T$. □



Lemma 2. For G connected, |int(G)| > max{ 



Lemma 3. Any connected graph G contains as a minor a connected resolved 
graph H such that \V{H)\ > chain(G)^G)-H 2 ° 
Lemma 4. Any connected resolved graph G contains as a minor a resolved tree
T such that \V(T)\ > and A{T) < A{G). □ 

To conclude the case r = 3, we apply Lemmas 3 and 4 to conclude that G
contains as a minor a resolved tree T such that $\Delta(T) \le \Delta(G)$ and
$|V(T)| \ge \frac{2|V(G)|}{5(\mathrm{chain}(G)\Delta(G)+2)} \ge \frac{2|V(G)|}{5(CD+2)} > (D+1)(k(D^2+1)+1)$. Each internal vertex
of T has degree at least 3. By Lemma 2, $|\mathrm{int}(T)| > k(\Delta(G)^2 + 1) + 1 \ge k(\Delta(T)^2 + 1) + 1$.
Applying Lemma 1, we conclude that $K_{1,3}^{k+1} \le T \le G$, as
needed to obtain a contradiction.

For the case r > 4, we apply the same reasoning as above, but continue with 
the following lemma. 

Lemma 5. For any resolved tree T, if $|V(T)| \ge 2$, then $|\mathrm{pend}(T)| \ge \frac{1}{2}|V(T)| + 1$.
Proof sketch. Induction on the number of pendant vertices. □

Applying the line of argument from the r = 3 case shows that when $r \ge 4$, G
contains as a minor a resolved tree T where
$|\mathrm{pend}(T)| \ge \frac{|V(G)|}{5(\mathrm{chain}(G)\Delta(G)+2)} + 1 \ge \frac{|V(G)|}{5(CD+2)} + 1 > \gamma(1+\gamma)(k((2\gamma-2)^2+1)+1)$.
This allows us to find a $K_{1,r}^{k+1}$
minor. We construct a tree U from T by iteratively finding and contracting an
edge $(u, v)$, $u, v \notin \mathrm{pend}(T)$, where $\deg_U(u) + \deg_U(v) \le r + 1$. Any vertex in U
that is not the result of a contraction has degree at most $\Delta(T)$. A vertex in U
that is the result of a contraction of an edge e will have degree at most $r - 1$, as
the sum of the degrees of the endpoints of e is at most $r + 1$. We conclude that
$\Delta(U) \le \max\{r - 1, \Delta(T)\} = m$. Clearly $\mathrm{pend}(U) = \mathrm{pend}(T)$ and for any edge
$(u, v) \in E(U)$, $u, v \notin \mathrm{pend}(U)$, it must be the case that $\deg_U(u) + \deg_U(v) \ge r + 2$.
Since $U \le T$, it will suffice to prove that $K_{1,r}^{k+1} \le U$.

By Lemma 2, $|\mathrm{int}(U)| \ge (1 + m)(k(2m - 2)^2 + 1)$. Elementary
arguments show that $U[\mathrm{int}(U)]$ contains a matching M where
$|M| \ge k(2m - 2)^2 + 1$. We form a new tree $U'$ from U by contracting all the edges in
M. Since the matching consists of edges in $U[\mathrm{int}(U)]$, for each edge $(u,v) \in M$,
neither u nor v is in $\mathrm{pend}(U)$. For any edge $(u, v) \in M$, $\deg_U(u) + \deg_U(v) \ge r + 2$.
Consequently, the contraction of an edge in M will result in a new vertex of
degree at least r in $U'$. For R the set of vertices in $U'$ with degree at least r,
clearly $|R| \ge |M|$, or $|R| \ge k(2m - 2)^2 + 1$.

Since the degree of a vertex w in $U'$ formed by contracting an edge $(u, v)$ will
be $\deg_U(u) + \deg_U(v) - 2$, and $\Delta(U) \le m$, we can conclude that $\deg_{U'}(w) \le 2m - 2$.
If a vertex is not an endpoint of a contracted edge, its degree in $U'$
will be the same as its degree in U and therefore it will be at most $\Delta(T)$. As
$\Delta(T) \ge 3$, we can conclude that $\Delta(U') \le 2m - 2$, or $|R| > k(\Delta(U')^2 + 1)$.
Applying Lemma 1 we have the contradiction $K_{1,r}^{k+1} \le U' \le U \le T \le G$. □

We remark that an easy corollary of the proof in the case $r \ge 4$ is that any
connected resolved graph of at least 10k vertices contains a spanning tree with at
least k leaves. This remark can improve the time complexity for the algorithm
solving the k-Leaf Spanning Tree problem (iHEna; EEini, pages 40-42),
from 0(n -b (2/c)^^) to 0{n + (10/c)^^). 
4 Sizes of Obstructions

Since the function f in the statement of Theorem 1 is polynomial in its arguments,
the following theorem gives a bound on the size of obstructions of within-k
graph classes that is polynomial in k, the degree bound, and $\mathrm{chain}(\mathrm{ob}(\mathcal{G}))$.

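Theorem 2. If $\mathcal{G}$ is a bounded degree minor-closed graph class then
$\text{max-size}(\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))) \le f(k, \Delta(\mathcal{G}) + 1, k + \Delta(\mathcal{G}) + 1, (k + 1)\mathrm{chain}(\mathrm{ob}(\mathcal{G})))$.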

Proof. Work of Dinneen [Din97] shows that for any minor-closed disjoint-union-closed
graph class $\mathcal{G}$, if a graph in $\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$ is the disjoint union of $r + 1$
nonempty connected graphs $C_0, \ldots, C_r$, then there exists a partition of $k + 1$
into at most $r + 1$ integers $k_0, k_1, \ldots, k_r$ such that for $i = 0, \ldots, r$,
$C_i \in \mathrm{ob}(\mathcal{W}_{k_i}(\mathcal{G}))$. Thus it suffices to prove this theorem for the connected obstructions
in $\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$.

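We first bound $\mathrm{chain}(\mathrm{ob}(\mathcal{W}_k(\mathcal{G})))$ and $\Delta(\mathrm{ob}(\mathcal{W}_k(\mathcal{G})))$. Consider a graph
$H \in \mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$ and an edge $e \in E(H)$. We form a new graph $H' = (V(H), E(H) - \{e\})$
by removing e from H. Since $H'$ is smaller than H, $H \in \mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$,
and $\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$ is a set of minor-minimal elements, we can conclude that
$H' \in \mathcal{W}_k(\mathcal{G})$. Consequently, by the definition of $\mathcal{W}_k(\mathcal{G})$, we can partition $V(H')$ into
sets $S_1$ and $S_2$ where $|S_1| \le k$ and $H'[S_2] \in \mathcal{G}$. It is not difficult to see that
$\Delta(H') \le \Delta(H'[S_2]) + |S_1|$. Since $\Delta(H'[S_2]) \le \Delta(\mathcal{G})$, $|S_1| \le k$, and
$\Delta(H) \le \Delta(H') + 1$, we have $\Delta(H) \le k + \Delta(\mathcal{G}) + 1$. The bound on $\mathrm{chain}(H)$ comes from
the following lemma.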

Lemma 6. For any minor-closed disjoint-union-closed graph class $\mathcal{G}$,
$\mathrm{chain}(\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))) \le (k + 1)\mathrm{chain}(\mathrm{ob}(\mathcal{G}))$. □

If $L = K_{1,\Delta(\mathcal{G})+1}^{k+1}$ is a minor of H, then since H is connected and L is not, L
must be a proper minor of H. But this leads to a contradiction. Since $\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$
is defined to be a minor-minimal set of obstructions for $\mathcal{W}_k(\mathcal{G})$, $L \in \mathcal{W}_k(\mathcal{G})$. By
the definition of $\mathcal{W}_k(\mathcal{G})$, L must contain a set S, $|S| \le k$, such that
$G' = L[V(L) - S] \in \mathcal{G}$. Clearly, $\Delta(G') \le \Delta(\mathcal{G})$ and therefore $K_{1,\Delta(\mathcal{G})+1}$ cannot be a
subgraph of $G'$. S must then contain at least one vertex in each of the disjoint
copies of $K_{1,\Delta(\mathcal{G})+1}$ as a subgraph in L, as otherwise $G'$ contains a copy of
$K_{1,\Delta(\mathcal{G})+1}$ as a subgraph. Since $|S| \le k$ and the number of copies of $K_{1,\Delta(\mathcal{G})+1}$
is $k + 1$, we obtain a contradiction.

We can thus conclude that H is L-minor free. Therefore, Theorem 1 implies
that $|V(H)| \le f(k, \Delta(\mathcal{G}) + 1, k + \Delta(\mathcal{G}) + 1, (k + 1)\mathrm{chain}(\mathrm{ob}(\mathcal{G})))$. □

For $\Delta(\mathcal{G}) = 0$, Theorem 2 implies that the obstructions for the class of graphs
with vertex cover at most k have size at most $(k + 1)(k + 2)$. In the case where
$\Delta(\mathcal{G}) = 1$, the upper bound is $(k + 3)(k((k + 2)^2 + 1) + 1)$ or $O(k^4)$; in the case
$\Delta(\mathcal{G}) = 2$, the upper bound is $\tfrac{5}{2}(k + 4)(k((k + 3)^2 + 1) + 1)((k + 1)\mathrm{chain}(\mathrm{ob}(\mathcal{G}))(k + 3) + 2)$
or $O(\mathrm{chain}(\mathrm{ob}(\mathcal{G}))k^5)$; and when $\Delta(\mathcal{G}) \ge 3$, the upper bound is in
$O(\mathrm{chain}(\mathrm{ob}(\mathcal{G}))k^2(k + \Delta(\mathcal{G}))^5)$.

A minor-closed graph class $\mathcal{G}$ has bounded degree if and only if $K_{1,\Delta(\mathcal{G})+1}$ is
one of its obstructions. This implies that if t is the size of the biggest obstruction
in $\mathrm{ob}(\mathcal{G})$, $\Delta(\mathcal{G}) \le t$. Since $\mathrm{chain}(\mathrm{ob}(\mathcal{G})) \le t$, we can now conclude the following.
Corollary 1. If $\mathcal{G}$ is a bounded degree minor-closed disjoint-union-closed graph
class then $\text{max-size}(\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))) = O(tk^7 + t^6k^2)$ where $t = \text{max-size}(\mathrm{ob}(\mathcal{G}))$.

5 Recognizing Graphs within k Vertices of a
Bounded-Degree Minor-Closed Class

When k is considered to be part of the input and $\mathcal{G}$ is characterized by a nontrivial
property, the problem of deciding whether $G \in \mathcal{W}_k(\mathcal{G})$ is NP-complete PW)| .
In contrast, when k is viewed as a parameter and $\mathcal{G}$ is any bounded degree
minor-closed disjoint-union-closed graph class, we are able to obtain a fast
fixed-parameter tractable algorithm, as shown in this section.

Theorem 3. Let $\mathcal{G}$ be any minor-closed graph class with the following properties:
(1) $\mathcal{G}$ contains only graphs of degree bounded by $D \ge 3$; (2) $\mathrm{chain}(\mathrm{ob}(\mathcal{G})) \le C$;
(3) all obstructions of $\mathcal{G}$ are connected; and (4) checking whether $G \in \mathcal{G}$ can
be executed in $O(|V(G)| \cdot \sigma_{\mathcal{G}})$ time where $\sigma_{\mathcal{G}}$ is a constant depending on the class
$\mathcal{G}$. Then, for any k, there exists an
$O((D + k + \sigma_{\mathcal{G}})|V(G)| + \sigma_{\mathcal{G}}(f(k, D + 1, D + k, C(k + 1)))^{k+1})$-time algorithm
that decides if a graph G is in $\mathcal{W}_k(\mathcal{G})$ and, if so,
produces a set $S \subseteq V(G)$, $|S| \le k$, such that $G[V(G) - S] \in \mathcal{G}$.

Proof. We propose the following algorithm.

1 $A \leftarrow \{v \in V(G) \mid \deg_G(v) > D + k\}$.

2 If $|A| > k$ then answer NO, else $H \leftarrow G[V(G) - A]$.

3 $k' \leftarrow k - |A|$.

4 While there exists in H a $\beta$-chain where $\beta > C(k' + 1)$,

replace it with a $(C(k' + 1))$-chain.

5 Remove from H all its connected components that are in $\mathcal{G}$.

6 If $|V(H)| > f(k', D + 1, D + k, C(k' + 1))$ then answer NO.

7 Answer YES iff $H[V(H) - S'] \in \mathcal{G}$ for some $S' \subseteq V(H)$, $|S'| \le k'$

($S \leftarrow A \cup S'$).

The following technical lemmas are needed to prove correctness. 

Lemma 7. For any minor-closed graph class $\mathcal{G}$, for any $a \ge \mathrm{chain}(\mathrm{ob}(\mathcal{G}))$, and
for any $G \in \mathcal{G}$, the subdivision of any edge in any a-chain of G results in a graph
that is also a member of $\mathcal{G}$. □



Lemma 8. If $\mathcal{G}$ is a minor-closed disjoint-union-closed graph class, then, for
any $k \ge 0$, no graph in $\mathrm{ob}(\mathcal{W}_k(\mathcal{G}))$ contains a member of $\mathcal{G}$ as a connected
component. □

To see that the algorithm is correct, we first observe that if there exists a
solution, it will contain every vertex of degree greater than $D + k$ and hence
$|A| \le k$ (step 2). The problem is then reduced to the question of whether
$H = G[V(G) - A]$ is in $\mathcal{W}_{k'}(\mathcal{G})$, for $\Delta(H) \le D + k$. As a consequence of Lemma 6
and property 2 of $\mathcal{G}$, $\mathrm{chain}(\mathrm{ob}(\mathcal{W}_{k'}(\mathcal{G}))) \le C(k' + 1)$. By combining this fact
with Lemma 7 (for $a = C(k' + 1)$), since $\mathcal{G}$ is minor-closed we can conclude
that step 4 preserves membership in $\mathcal{W}_{k'}(\mathcal{G})$ and results in a graph H where
$\mathrm{chain}(H) \le C(k' + 1)$. Property 3 of $\mathcal{G}$ and Lemma 8 imply that membership in
$\mathcal{W}_{k'}(\mathcal{G})$ is invariant under the removal from H of all its connected components
that are members of $\mathcal{G}$. In this way, the problem is reduced to determining
whether $J \in \mathcal{W}_{k'}(\mathcal{G})$ where $\mathrm{chain}(J) \le C(k' + 1)$ and $\Delta(J) \le D + k$, where for
$J_1, \ldots, J_m$ the connected components of J, $J_i \notin \mathcal{G}$. Each $J_i$ must then contain
at least one vertex of a possible solution S.

If $J \in \mathcal{W}_{k'}(\mathcal{G})$, there exists a partition $(S_1, \ldots, S_m)$ of S where for $i = 1, \ldots, m$,
$S_i = S \cap V(J_i)$, $k_i = |S_i|$, $J_i[V(J_i) - S_i] \in \mathcal{G}$, and therefore $J_i \in \mathcal{W}_{k_i}(\mathcal{G})$. By
the proof of Theorem 2, for $i = 1, \ldots, m$, $J_i$ is $K_{1,D+1}^{k_i+1}$-minor free. Since for
$i = 1, \ldots, m$, $\Delta(J_i) \le D + k$ and $\mathrm{chain}(J_i) \le C(k' + 1)$, we apply Theorem 1
to show that $|V(J_i)| \le f(k_i, D + 1, D + k, C(k' + 1))$. Since J is composed of its
connected components, $|V(J)| \le \sum_{i=1}^{m} f(k_i, D + 1, D + k, C(k' + 1))$. The
maximum value of $|V(J)|$ is achieved when m = 1, due to the restrictions that
$k_i \ge 1$ for all $1 \le i \le m$ and $\sum_{i=1}^{m} k_i \le k'$, as f is a monotonically increasing function
and hence $f(k_1) + f(k_2) \le f(k_1 + k_2)$. We can then conclude that
$|V(J)| \le f(k', D + 1, D + k, C(k' + 1))$, justifying step 6.

To determine the complexity of the algorithm, we first observe that steps 1
through 3 run in $O(|V(G)|(D + k))$ time and step 4 in $O(|V(G)|)$ time. As a
consequence of property 4 of $\mathcal{G}$, step 5 can be executed in $O(|V(G)|\sigma_{\mathcal{G}})$ time.
Step 7 can be implemented by checking whether $J[V(J) - S'] \in \mathcal{G}$ for all sets
$S' \subseteq V(J)$, $|S'| \le k'$, in $O(\sigma_{\mathcal{G}}(f(k', D + 1, D + k, C(k' + 1)))^{k'+1})$ time. The overall
time complexity of the algorithm is thus in
$O((D + k + \sigma_{\mathcal{G}})|V(G)| + \sigma_{\mathcal{G}}(f(k, D + 1, D + k, C(k + 1)))^{k+1})$, as claimed. □

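As we mentioned at the end of Section 4, both $\Delta(\mathcal{G})$ and $\mathrm{chain}(\mathrm{ob}(\mathcal{G}))$ are
bounded by $t = \text{max-size}(\mathrm{ob}(\mathcal{G}))$ and hence the complexity of Theorem 3 can be
rewritten as $O((t + k + \sigma_{\mathcal{G}})|V(G)| + \sigma_{\mathcal{G}}(f(k, t + 1, t + k, t(k + 1)))^{k+1})$
$= O((t + k + \sigma_{\mathcal{G}})|V(G)| + \sigma_{\mathcal{G}}(O(tk^2(t + k)^5))^{k+1})$.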

Suppose now that $\mathcal{G}$ satisfies conditions 1-3 of Theorem 3 and that
$\mathrm{ob}(\mathcal{G})$ is known. As $K_{1,\Delta(\mathcal{G})+1} \in \mathrm{ob}(\mathcal{G})$, all the graphs in $\mathcal{G}$ have pathwidth
bounded by D [BRST91]. Using standard techniques on graphs with bounded
treewidth one can construct an algorithm that checks whether $H \le G$
in $O(\chi(t, |V(H)|)|V(G)|)$ time, where t is a bound on the treewidth of G
and $\chi$ is a super-polynomial function. Consequently, if we know $\mathrm{ob}(\mathcal{G})$, then
each of the checks for membership in $\mathcal{G}$ applied in steps 5 and 7 can be
done in $O(|\mathrm{ob}(\mathcal{G})| \cdot \chi(D, \text{max-size}(\mathrm{ob}(\mathcal{G})))|V(G)|)$ time and $\sigma_{\mathcal{G}}$ can be replaced by
$|\mathrm{ob}(\mathcal{G})| \cdot \chi(D, \text{max-size}(\mathrm{ob}(\mathcal{G})))$. As a result, condition 4 will be satisfied.

6 Recognizing Graphs within k Vertices of a
Bounded-Degree Class

In this section we extend our techniques to provide a solution to a generalization
of the vertex cover problem that is not closed under taking of minors. Let $\mathcal{G}_D$
denote the class of graphs with maximum degree at most D. We define the
problem Almost D-bounded graph as deciding whether or not a graph G is in
$\mathcal{W}_k(\mathcal{G}_D)$, and if so, finding a set $S \subseteq V(G)$ where $|S| \le k$ and $G[V(G) - S] \in \mathcal{G}_D$.
If D = 0, the problem is simply vertex cover. To the best of our knowledge,
the best algorithm so far for the general problem can be derived as a subcase of
the Graph Modification Problem [Cai96]. The running time obtained
is 0{{D + l)^|y(G)p+^). The function given below improves on this bound 
(Theorem 4) but the algorithm which uses it as a subroutine is even faster.
Function $\mathrm{Search}_D(G, k)$

1 If $k < 0$ then return NO

2 If $\Delta(G) > D + k$, choose $v \in V(G)$ with $\deg_G(v) > D + k$

if $\mathrm{Search}_D(G - v, k - 1)$ = NO, then return NO
else return $\{v\} \cup \mathrm{Search}_D(G - v, k - 1)$

3 If $\Delta(G) \le D$, then return $\emptyset$

4 Choose $v \in V(G)$ with $D < \deg_G(v) \le D + k$.

(If there exists no such vertex in G then return NO)

5 If, for some $u \in N_G(v) \cup \{v\}$, $\mathrm{Search}_D(G - u, k - 1)$ is a set

then return $\{u\}$ together with that set; else return NO
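
The bounded search tree above can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes a graph is a dictionary from vertices to neighbour sets, represents the answer NO by `None`, and adds the branching vertex u to the set returned in step 5 so that the result is a complete deletion set. The helper name `delete` is hypothetical.

```python
def delete(adj, v):
    """Return the graph G - v."""
    return {x: ns - {v} for x, ns in adj.items() if x != v}

def search(adj, D, k):
    """Return a set S, |S| <= k, whose removal leaves maximum degree <= D,
    or None if no such set exists."""
    if k < 0:                                        # step 1
        return None
    degs = {v: len(ns) for v, ns in adj.items()}
    forced = [v for v, d in degs.items() if d > D + k]
    if forced:                                       # step 2: v is in every solution
        v = forced[0]
        rest = search(delete(adj, v), D, k - 1)
        return None if rest is None else {v} | rest
    if all(d <= D for d in degs.values()):           # step 3
        return set()
    v = next(v for v, d in degs.items() if d > D)    # step 4
    for u in [v, *adj[v]]:                           # step 5: branch on v and its neighbours
        rest = search(delete(adj, u), D, k - 1)
        if rest is not None:
            return {u} | rest
    return None

star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

With D = 0 the function decides vertex cover: the path on four vertices has no cover of size 1 but has one of size 2.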

Algorithm

1 $A \leftarrow \{v \in V(G) \mid \deg_G(v) > D + k\}$

2 If $|A| > k$ then return NO, else $H \leftarrow G[V(G) - A]$.

3 $k' \leftarrow k - |A|$.

4 $F \leftarrow \{v \in V(H) \mid \deg_H(v) \le D\}$, $C \leftarrow V(H) - F$

5 $B \leftarrow \{v \in F \mid v$ adjacent, in H, to a vertex in $C\}$

6 If $|B \cup C| > (D + k + 1)k' + (D + k)(D + k + 1)k'$ then return NO

7 return $A \cup \mathrm{Search}_D(H[B \cup C], k')$
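
Steps 1 to 6 act as a preprocessing phase that shrinks the instance before the search function is called. The sketch below covers only that phase, under the assumption that graphs are dictionaries from vertices to neighbour sets; the function name `kernelize` and its return convention (`None` for NO, otherwise the triple of forced vertices, reduced graph and remaining budget) are illustrative choices.

```python
def kernelize(adj, D, k):
    """Steps 1-6: return (A, H[B ∪ C], k') or None when the answer is NO."""
    A = {v for v, ns in adj.items() if len(ns) > D + k}        # step 1
    if len(A) > k:                                             # step 2
        return None
    H = {v: ns - A for v, ns in adj.items() if v not in A}
    k1 = k - len(A)                                            # step 3
    F = {v for v, ns in H.items() if len(ns) <= D}             # step 4
    C = set(H) - F
    B = {v for v in F if H[v] & C}                             # step 5
    keep = B | C
    bound = (D + k + 1) * k1 + (D + k) * (D + k + 1) * k1
    if len(keep) > bound:                                      # step 6
        return None
    return A, {v: H[v] & keep for v in keep}, k1

star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
matching = {v: {v ^ 1} for v in range(20)}   # ten disjoint edges
```

Step 7 would then run the search function on the returned graph with budget k' and add A to its answer.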



Theorem 4. For $\mathcal{G}_D$ the class of graphs with maximum degree at most
$D \ge 0$, function $\mathrm{Search}_D(G, k)$ solves Almost D-bounded graph in time
$O(|V(G)|(D + k)^{k+1})$. □



Theorem 5. For $\mathcal{G}_D$ the class of graphs with maximum degree at most $D \ge 0$,
the algorithm above solves Almost D-bounded graph in time
$O(|V(G)|(D + k) + k(D + k)^{k+3})$.

Proof. If $G \in \mathcal{W}_k(\mathcal{G}_D)$, then there exists a set S, $|S| \le k$, where $\Delta(G[V(G) - S]) \le D$,
and hence all the vertices of G that are not in S have degree at most $D + k$,
yielding $A \subseteq S$. If the set A has more than k vertices, then G cannot be a member
of $\mathcal{W}_k(\mathcal{G}_D)$ (steps 1 and 2). It is now easy to see that $G \in \mathcal{W}_k(\mathcal{G}_D)$ if and only if
$H \in \mathcal{W}_{k-|A|}(\mathcal{G}_D)$. Therefore, the problem is reduced to the question of whether
or not $H \in \mathcal{W}_{k'}(\mathcal{G}_D)$, where $\Delta(H) \le D + k$. The vertices of $V(H)$ that are not
in $B \cup C$ all have degree at most D and are adjacent only to vertices that have degree
at most D. As a consequence $H \in \mathcal{W}_{k'}(\mathcal{G}_D)$ if and only if $H[B \cup C] \in \mathcal{W}_{k'}(\mathcal{G}_D)$.
The graph $H' = H[B \cup C]$ has maximum degree at most $D + k$ and any vertex
of degree at most D is adjacent to a vertex of degree greater than D in $H'$.

We let $B'$ be the vertices in $V(H') - S$ of degree at most D and let
$C' = V(H') - S - B'$. Each vertex in $C'$ will be adjacent to at least one vertex in
S as otherwise $H'[V(H') - S]$ will contain a vertex of degree greater than D.
Since $\Delta(S) \le \Delta(H') \le k + D$, $C'$ will contain at most $D + k$ neighbours of
each vertex in S, or a total of at most $k'(D + k)$ vertices. We have established
that $|C'| + |S| \le (D + k + 1)k'$. We let J be the vertices of $B'$ that are adjacent
to vertices in $C' \cup S$ and observe that since each vertex in $C' \cup S$ can have at
most $k + D$ neighbours in J, $|J| \le (D + k)(|C'| + |S|) \le (D + k)(D + k + 1)k'$.
Any vertex in $B'$ is adjacent to a vertex in C. Therefore $B' = J$ and
$|V(H')| = |C'| + |S| + |B'| \le (D + k + 1)k' + (D + k)(D + k + 1)k'$. By Theorem 4, step
7 requires time $O(|V(H')|(D + k)^{k+1}) = O(k(D + k)^{k+3})$. Since steps 1-6 can
be trivially implemented in $O((D + k)|V(G)|)$ time, the total running time is as
claimed. □



7 Future Work 

Ideally, we would like to remove the restriction of closure under disjoint union
from our results. As a consequence of Dinneen's work [Din97], an obstruction
of $\mathcal{W}_k(\mathcal{G})$ can have at most $k + 1$ components if $\mathcal{G}$ is a minor-closed
disjoint-union-closed graph class. Unfortunately, without this restriction, $\mathcal{G}$ may have
disconnected obstructions, invalidating the proof of Lemma El It would be an 
interesting result, requiring new techniques, to determine an upper bound on
the number of connected components in an obstruction of $\mathcal{W}_k(\mathcal{G})$ when $\mathcal{G}$ is not
a disjoint-union-closed graph class. It would also be nice to use the approach of
Section 6 to find a better algorithm for the general $\Pi_{i,j,k}$-Graph Modification
Problem.
Abstract. We show that there exists a linear time algorithm for deciding
whether a graph of bounded tree-width has clique-width k for some
fixed integer k.



1 Introduction 

The clique-width of a graph is defined by a composition mechanism for
vertex-labeled graphs, see [CO00]. The operations are the vertex disjoint union, the
addition of edges between all pairs of vertices with a given pair of labels, and
the relabeling of vertices. The clique-width of a graph G is the minimum number
of labels needed to define G. Graphs of bounded clique-width are especially
interesting from an algorithmic point of view. A lot of NP-complete graph problems
can be solved in polynomial time for graphs of bounded clique-width if the
composition of the graph is explicitly given. For example, all graph properties
which are expressible in monadic second order logic with quantifications over
vertices and vertex sets ($\mathrm{MSO}_1$-logic) are decidable in linear time on graphs
of bounded clique-width, see IGMKUUI . The $\mathrm{MSO}_1$-logic has been extended by
counting mechanisms which allow the expressibility of optimization problems
concerning maximal or minimal vertex sets, see KiMkool . All these graph problems
expressible in extended $\mathrm{MSO}_1$-logic can be solved in polynomial time on
graphs of bounded clique-width. Furthermore, a lot of NP-complete graph problems
which are not expressible in $\mathrm{MSO}_1$-logic or extended $\mathrm{MSO}_1$-logic like
Hamiltonicity and certain partitioning problems can also be solved in polynomial time
on graphs of bounded clique-width, see |KR,nilWa,nfl4j .

If a graph G has clique-width at most k then the edge complement $\overline{G}$ has
clique-width at most 2k, see mm-. Distance hereditary graphs have clique-width
at most 3, see irTHTTH . The set of all graphs of clique-width at most 2 is
the set of all labeled cographs. The clique-width of permutation graphs, interval
graphs, grids and planar graphs is not bounded by some fixed integer k, see
EEOni-. An arbitrary graph with n vertices has clique-width at most $n - r$, if
$2^r < n - r$, see mH]. The recognition problem for graphs of clique-width
at most k is still open for $k \ge 4$. Clique-width of at most 3 is decidable in
polynomial time, see [CHL+00]. Clique-width of at most 2 is decidable in linear
time, see [CPS85].

A famous class of graphs for which a lot of NP-complete graph problems
can be solved in polynomial time is the class of graphs of bounded tree-width,
see Bodlaender [Bod98] for a survey. For every fixed integer k, it is decidable in
linear time whether a given graph G has tree-width k, see Esnni-. All graph
properties expressible in monadic second order logic with quantifications over
vertex sets and edge sets ($\mathrm{MSO}_2$-logic) are decidable in linear time for graphs of
bounded tree-width by dynamic programming, see [Cou90]. The $\mathrm{MSO}_2$-logic has
also been extended by counting mechanisms to express optimization problems
which can then be solved in polynomial time for graphs of bounded tree-width,
see IKUm .

Every graph of tree-width at most k has clique-width at most $2^{k+1} + 1$, see
fmrp . Since the set of all cographs already contains all complete graphs, the
set of all graphs of clique-width at most 2 does not have bounded tree-width. In
[Kiwnn] it is shown that every graph of clique-width k which does not contain
the complete bipartite graph $K_{n,n}$ for some $n > 1$ as a subgraph has tree-width
at most $3k(n - 1) - 1$.

A simple algorithm to decide a graph property on a graph of bounded tree-width
can be obtained from a partition of all l-terminal graphs into a finite
number of equivalence classes. An l-terminal graph is
a graph with a list of l distinct vertices called terminals. Two l-terminal graphs
G and H can be combined to a graph $G \circ H$ by taking the disjoint union of
G and H and then identifying the i-th terminal of G with the i-th terminal of
H for $1 \le i \le l$. They are called equivalent with respect to a graph property
$\Pi$ if for all l-terminal graphs J the answer to $\Pi$ is the same for $G \circ J$ and
$H \circ J$. A graph property $\Pi$ is decidable in linear time on a graph of bounded
tree-width if there is a finite number of equivalence classes with respect to $\Pi$
for all l-terminal graphs and all $l \ge 0$. The linear time algorithm first computes
a binary tree-decomposition T for G and then bottom-up the equivalence class
for every l-terminal graph $G'$ represented by a complete subtree $T'$ of T. The
equivalence class of $G'$ defined by subtree $T'$ with root $u'$ is computable in time
O(1) from the classes of the two l-terminal graphs defined by the two subtrees
in $T' - \{u'\}$.

In this paper, we prove that the graph property “clique-width at most k”
divides the set of all l-terminal graphs into a finite number of equivalence classes.
This implies that there exists a linear time algorithm for deciding “clique-width
at most k” for graphs of bounded tree-width. Since every graph of tree-width
r has clique-width at most $2^{r+1} + 1$, there is also a linear time algorithm for
computing the clique-width of a graph of bounded tree-width by testing
“clique-width at most k” for $k = 1, \ldots, 2^{r+1} + 1$. The proofs of lemma IDliQ
and theorem 0 are omitted due to space limitations. Note that it remains open
whether the clique-width k property is expressible in $\mathrm{MSO}_2$-logic and whether
“clique-width at most k” is decidable in polynomial time for arbitrary graphs.

A complete version can be found at www.cs.uni-duesseldorf.de/~wanke.
2 Basic Definitions



We work with finite undirected graphs $G = (V_G, E_G)$, where $V_G$ is a finite set
of vertices and $E_G \subseteq \{\{u,v\} \mid u,v \in V_G, u \ne v\}$ is a finite set of edges. Graph
$J = (V_J, E_J)$ is a subgraph of G if $V_J$ is a subset of $V_G$ and $E_J$ is a subset of
$E_G \cap \{\{u,v\} \mid u, v \in V_J, u \ne v\}$. J is an induced subgraph of G if additionally
$E_J = \{\{u,v\} \in E_G \mid u,v \in V_J\}$. To distinguish between the vertices of (non-tree)
graphs and trees, we call the vertices of the trees simply nodes.

The notion of clique-width for labeled graphs is defined by Courcelle and
Olariu in [CO00]. Let $[k] := \{1, \ldots, k\}$ be the set of all integers between 1 and
k. A k-labeled graph $G = (V_G, E_G, \mathrm{lab}_G)$ is a graph $(V_G, E_G)$ whose vertices are
labeled by some mapping $\mathrm{lab}_G : V_G \to [k]$. A labeled graph $J = (V_J, E_J, \mathrm{lab}_J)$
is a subgraph of G if $V_J \subseteq V_G$, $E_J \subseteq E_G \cap \{\{u,v\} \mid u, v \in V_J, u \ne v\}$ and
$\mathrm{lab}_J(u) = \mathrm{lab}_G(u)$ for all $u \in V_J$. The labeled graph which consists of a single
vertex labeled by $t \in [k]$ is denoted by $\bullet_t$.

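Definition 1 (Clique-width, [CO00]). Let k be some positive integer. The
class $CW_k$ of labeled graphs is recursively defined as follows.

1. The single vertex graph $\bullet_t$ for some $t \in [k]$ is in $CW_k$.

2. Let $G = (V_G, E_G, \mathrm{lab}_G) \in CW_k$ and $J = (V_J, E_J, \mathrm{lab}_J) \in CW_k$ be two vertex
disjoint labeled graphs. Then $G \oplus J := (V', E', \mathrm{lab}')$ defined by $V' := V_G \cup V_J$,
$E' := E_G \cup E_J$, and

$$\mathrm{lab}'(u) := \begin{cases} \mathrm{lab}_G(u) & \text{if } u \in V_G \\ \mathrm{lab}_J(u) & \text{if } u \in V_J \end{cases} \quad \forall u \in V'$$

is in $CW_k$.

3. Let $i, j \in [k]$ be two distinct integers and $G = (V_G, E_G, \mathrm{lab}_G) \in CW_k$ be a
labeled graph then

a) $\rho_{i \to j}(G) := (V_G, E_G, \mathrm{lab}')$ defined by

$$\mathrm{lab}'(u) := \begin{cases} \mathrm{lab}_G(u) & \text{if } \mathrm{lab}_G(u) \ne i \\ j & \text{if } \mathrm{lab}_G(u) = i \end{cases} \quad \forall u \in V_G$$

is in $CW_k$ and

b) $\eta_{i,j}(G) := (V_G, E', \mathrm{lab}_G)$ defined by

$$E' := E_G \cup \{\{u, v\} \mid u, v \in V_G,\ u \ne v,\ \mathrm{lab}(u) = i,\ \mathrm{lab}(v) = j\}$$

is in $CW_k$.

The clique-width of a labeled graph G is the smallest integer k such that $G \in CW_k$.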

An expression X built with the operations $\bullet_t$, $\oplus$, $\rho_{i \to j}$, $\eta_{i,j}$ for integers $t, i, j \in [k]$
is called a k-expression or expression for short. To distinguish between an
expression and the graph defined by the expression, we denote by val(X) the
graph defined by expression X. That is, $CW_k$ is the set of all graphs val(X),
where X is a k-expression.

The clique-width of an unlabeled graph $G = (V_G, E_G)$ is the smallest integer k
such that there is some labeling $\mathrm{lab}_G : V_G \to [k]$ such that $G' = (V_G, E_G, \mathrm{lab}_G)$
has clique-width k. Since a relabeling of vertices does not change the clique-width
of the graph, we consider an unlabeled graph as a labeled graph with all vertices
labeled 1. This allows us to use the notation “graph” without any confusion for
labeled and unlabeled graphs.

We next define a so-called normal form for a k-expression. This normal form
does not restrict the graphs that can be defined by k-expressions but helps us
to prove our main result.

To keep the definition of our normal form as simple as possible, we enumerate
the vertices in val(X) for some k-expression X as follows. If $G = \mathrm{val}(\bullet_t)$, then
the single vertex in G is the first vertex of G. Let $G = \mathrm{val}(Y_1 \oplus Y_2)$. If $\mathrm{val}(Y_1)$
has n vertices and $\mathrm{val}(Y_2)$ has m vertices, then the i-th vertex of G is the i-th
vertex of $\mathrm{val}(Y_1)$ if $i \le n$ and the $(i - n)$-th vertex of $\mathrm{val}(Y_2)$ if $i > n$. The i-th
vertex of $\mathrm{val}(\eta_{i,j}(Y))$ and $\mathrm{val}(\rho_{i \to j}(Y))$ is the i-th vertex of val(Y). We say two
expressions X and Y are equivalent, denoted by $X \equiv Y$, if val(X) and val(Y)
are isomorphic with regard to the order of the vertices, that is,

1. val(X) and val(Y) have the same number n of vertices,

2. the i-th vertex in val(X), $1 \le i \le n$, has the same label as the i-th vertex in
val(Y), and

3. there is an edge between the i-th and j-th vertex in val(X), $1 \le i, j \le n$, if
and only if there is an edge between the i-th and j-th vertex in val(Y).

Otherwise X and Y are not equivalent, denoted by $X \not\equiv Y$.
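
The evaluation of a k-expression and the equivalence test can be sketched as follows. This is a minimal illustration under stated assumptions: expressions are encoded as nested tuples with the hypothetical tags `"vertex"`, `"union"`, `"rho"` and `"eta"`, and vertices are numbered from 0 in the enumeration order (the text counts from 1).

```python
def val(expr):
    """Evaluate an expression to (labels, edges): labels[i] is the label of the
    i-th vertex, edges is a set of pairs (u, v) with u < v."""
    op = expr[0]
    if op == "vertex":                       # single labeled vertex
        return [expr[1]], set()
    if op == "union":                        # disjoint union, left vertices first
        la, ea = val(expr[1])
        lb, eb = val(expr[2])
        n = len(la)
        return la + lb, ea | {(u + n, v + n) for u, v in eb}
    _, i, j, sub = expr
    labels, edges = val(sub)
    if op == "rho":                          # relabel every i to j
        return [j if lab == i else lab for lab in labels], edges
    if op == "eta":                          # join all i-labeled to all j-labeled vertices
        new = {(min(u, v), max(u, v))
               for u, a in enumerate(labels) if a == i
               for v, b in enumerate(labels) if b == j}
        return labels, edges | new
    raise ValueError(op)

def equivalent(X, Y):
    """Same vertex count, same label per position, same edges per position."""
    return val(X) == val(Y)

edge = ("eta", 1, 2, ("union", ("vertex", 1), ("vertex", 2)))
triangle = ("eta", 1, 2, ("union", ("rho", 2, 1, edge), ("vertex", 2)))
```

The expression `triangle` builds the complete graph on three vertices with only two labels, and swapping the operands of a union generally yields a non-equivalent expression because the vertex order changes.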

Definition 2 (Normal form). The normal form for k-expressions is defined 
as follows. 

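1. The k-expression $\bullet_t$ for some $t \in [k]$ is in normal form.

2. If $Y_1$ and $Y_2$ are two k-expressions in normal form then the k-expression

$$\rho_{i_n \to j_n}(\cdots \rho_{i_1 \to j_1}(\eta_{i'_{n'},j'_{n'}}(\cdots \eta_{i'_1,j'_1}(Y_1 \oplus Y_2) \cdots)) \cdots)$$

for $i_1, j_1, \ldots, i_n, j_n, i'_1, j'_1, \ldots, i'_{n'}, j'_{n'} \in [k]$ is in normal form if the following
properties hold.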

a) For every edge insertion operation > 1 < < n! , 

(^1 ® ^ 2 ) ^ n © Id, (Id) = Id, (Id) = Id, 

and for every edge insertion operation rjii^,,j'^,, 1 < k' < I', {i;/,J//} yf 
{t'k'jj'k'}- 

b) For every relabeling operation f < I <n, graph 

val{p^,_^^j,_, (• • • {pp ,r (• • • (Id © Id) •••))•• • )) 



has a vertex labeled by ii, a vertex labeled by ji, and ii ^ {j'l, . . . , 
and at least one of the two indices ii and ji is in {i(,j(,... ,i'nf,jn'}- 
c) If there are two relabeling operations pg^j, and pi^^j^, 1 < I < k < n,

such that ji = jk then k G {i[,,j[,,... or ik G {i[,,j[,,... , 

i'n'J'n'}- 

d) If pi^^j^, 1 <l <n, is a relabeling operation then 

i. if valfYi) has a vertex labeled by ii then 



(’ ’ ’ ftl— >il ’ ’ Pi'iJi ( © ^2 )■■■))'■') 

^ Pil^jl (■ ■ ■ Pil^jl ( Pn^jl (^l) ffi ^ )■■■))■■■) 



and 

ii. if val{Y 2 ) has a vertex labeled by ii then 



Pil-^jl (■ ■ ■ Pil^jl (■ ■ ■ Pi'iJ'i (^1® ©2)’’’))’’‘) 

^ Pil^jl (■ ■ ■ Pil^jl (' ' ' ® Pil^jl 0^2) )■■■))■■■)■ 



In an expression in normal form a relabeling operation is never done immedi- 
ately before an edge insertion operation. Furthermore, edge insertion operations 
and relabeling operations are done as soon as possible, see 2. (a) and 2.(d). The 
rest of the restrictions are only to avoid redundant composition steps. 

The following lemma shows that every graph of clique-width k can be defined
by some k-expression in normal form.



Lemma 1. For every graph G of clique-width k there is a k-expression for G in 
normal form. 



Here we should give the following remark. Assume an expression

$$X = \rho_{i_n \to j_n}(\cdots \rho_{i_1 \to j_1}(Y) \cdots)$$

is in normal form, where $Y = \eta_{i'_{n'},j'_{n'}}(\cdots \eta_{i'_1,j'_1}(Z) \cdots)$ for some expression Z. If
we choose a relabeling operation $\rho_{i_l \to j_l}$ for some l, $1 \le l \le n$, and substitute all
labels $i_l$ in Y by $j_l$ and all labels $j_l$ in Y by $i_l$ to get an expression denoted by
$Y_{i_l \leftrightarrow j_l}$, then the new expression

^ ~ Pin^jni.' ’ ’ ■ ■ ■ ) 

is also in normal form and defines the same graph as before, i.e., $X \equiv X'$.

Definition 3 (Expression tree). The expression tree $T = (V_T, E_T, \mathrm{lab}_T)$ of a
k-expression X is an ordered rooted tree whose nodes are labeled by the operations
of the expression and whose arcs are directed from the sons to the fathers towards
the root of T.

The expression tree T of expression $\bullet_t$ consists of a single node r (the root
of T) labeled by $\bullet_t$. The expression tree T of $\eta_{i,j}(X)$ and $\rho_{i \to j}(X)$ consists of
the expression tree $T'$ of expression X with an additional node r (the root of T)
labeled by $\eta_{i,j}$ or $\rho_{i \to j}$, respectively, and an additional arc from the root of $T'$ to
node r. The expression tree T of $X_1 \oplus X_2$ consists of the disjoint union of the
expression trees $T_1$ and $T_2$ of $X_1$ and $X_2$, respectively, with an additional node
r (the root of T) labeled by $\oplus$ and two additional arcs from the roots of $T_1$ and
$T_2$ to node r. The root of $T_1$ is the left son of r and the root of $T_2$ is the right
son of r.

A node of T labeled by •t, rjij, Pi^j, or © is called a leaf, edge insertion 
node, relabeling node, or union node, respectively. 

For every expression X, there is a one-to-one correspondence between the
leaves of the expression tree $T = (V_T, E_T, lab_T)$ of X and the vertices of the
graph $G = (V_G, E_G, lab_G) = val(X)$. The i-th vertex in val(X) corresponds to
the i-th leaf of T counted from left to right.
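The correspondence between leaves and vertices can be made concrete with a small evaluator for val(X). This is only a sketch: the nested-tuple encoding of expressions and the function name `val` as Python code are assumptions made for illustration, while the four operations follow their standard clique-width meaning (create a labeled vertex, insert all edges between two label classes, relabel, disjoint union).

```python
# Hypothetical tuple encoding of k-expressions (not taken from the paper):
#   ("v", t)           a single vertex labeled t
#   ("eta", i, j, X)   add all edges between vertices labeled i and j (i != j)
#   ("rho", i, j, X)   relabel every vertex labeled i to j
#   ("union", X1, X2)  disjoint union of two expressions

def val(expr):
    """Evaluate a k-expression.

    Returns (labels, edges). Vertex v is the v-th leaf of the expression
    tree counted from left to right (0-based), labels[v] is its label, and
    edges is a set of pairs (a, b) with a < b.
    """
    op = expr[0]
    if op == "v":
        return [expr[1]], set()
    if op == "union":
        left_labels, left_edges = val(expr[1])
        right_labels, right_edges = val(expr[2])
        shift = len(left_labels)  # right-hand vertices come after the left ones
        shifted = {(a + shift, b + shift) for a, b in right_edges}
        return left_labels + right_labels, left_edges | shifted
    _, i, j, sub = expr
    labels, edges = val(sub)
    if op == "rho":
        return [j if lab == i else lab for lab in labels], edges
    if op == "eta":
        new_edges = set(edges)
        for a in range(len(labels)):
            for b in range(a + 1, len(labels)):
                if {labels[a], labels[b]} == {i, j}:
                    new_edges.add((a, b))
        return labels, new_edges
    raise ValueError(f"unknown operation: {op!r}")


# Two vertices labeled 1 are joined to one vertex labeled 2; label 2 is then
# relabeled to 1.
X = ("rho", 2, 1, ("eta", 1, 2, ("union", ("union", ("v", 1), ("v", 1)), ("v", 2))))
labels, edges = val(X)
```

Because the union case appends the right operand's vertices after the left operand's, vertex numbers coincide with leaf positions read from left to right.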

For some node u of T, let T(u) be the subtree of T induced by node u and all
nodes v of T from which there is a directed path to u in T. Tree T(u) is always
an expression tree. The expression X' of T(u) defines a (possibly relabeled)
subgraph G' of G. The vertices of G' are the vertices of G corresponding to the
leaves of T(u). The edges of G' and the labels of the vertices of G' are defined
by expression X', which is a sub-expression of X. The subgraph G' of G is also
denoted by G(T, u) or G(u), if tree T is known from the context.

Definition 4 (l-terminal graph). An l-terminal graph is a triple $G = (V_G,
E_G, P_G)$ where $(V_G, E_G)$ is a graph and $P_G = (x_1, \ldots, x_l)$ is a sequence of $l \ge 0$
mutually distinct vertices of $V_G$. The vertices in $P_G$ are called terminal vertices
or terminals. Vertex $x_i$, $1 \le i \le l$, is called the i-th terminal of G. The vertices
in $V_G - P_G$ are called the inner vertices of G.

The operation $\circ$ maps two l-terminal vertex disjoint graphs H and J to some
graph $H \circ J$, by taking the disjoint union of H and J, then identifying
corresponding terminals, i.e., for $i = 1, \ldots, l$, identifying the i-th terminal of H with
the i-th terminal of J, and removing multiple edges.

Terminal graphs are also called sourced graphs, see \Knm\ . The composition 
mechanism can easily be extended to labeled graphs under the assumption that 
the i-th terminal of H and the i-th terminal of J have the same label. 
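The composition of two terminal graphs can be sketched in a few lines. The representation below (a triple of a vertex set, a list of edge pairs, and a list of terminals) and the name `compose` are assumptions chosen for illustration; labels are left out to keep the example short.

```python
def compose(h, j):
    """Return the vertex set and edge set of H o J for two l-terminal graphs.

    Each graph is a triple (vertices, edges, terminals): a set of vertex
    names, an iterable of 2-element pairs, and a list of l distinct terminals.
    """
    h_vertices, h_edges, h_terminals = h
    j_vertices, j_edges, j_terminals = j
    if len(h_terminals) != len(j_terminals):
        raise ValueError("both graphs need the same number of terminals")

    # Tag every vertex with its side so the two graphs are vertex disjoint,
    # then send the i-th terminal of J to the i-th terminal of H.
    merged = {("J", t): ("H", s) for s, t in zip(h_terminals, j_terminals)}

    def name(side, v):
        return merged.get((side, v), (side, v))

    vertices = {name("H", v) for v in h_vertices} | {name("J", v) for v in j_vertices}
    # A set of frozensets drops multiple edges automatically.
    edges = {frozenset((name("H", a), name("H", b))) for a, b in h_edges}
    edges |= {frozenset((name("J", a), name("J", b))) for a, b in j_edges}
    return vertices, edges


# Both graphs have two terminals that are joined by an edge, plus one inner vertex.
H = ({"a", "x", "y"}, [("a", "x"), ("x", "y")], ["x", "y"])
J = ({"b", "p", "q"}, [("p", "q"), ("q", "b")], ["p", "q"])
vertices, edges = compose(H, J)
```

In this example the edge between the two terminals occurs in both H and J and survives only once in the result, which has four vertices and three edges.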

Definition 5 (Replaceability). Let $\Pi$ be a graph property, i.e., $\Pi : \mathcal{G} \to
\{0, 1\}$, where $\mathcal{G}$ is the set of all graphs. Two l-terminal graphs $G_1$ and $G_2$ are
called replaceable with respect to $\Pi$, denoted by $G_1 \sim_{\Pi,l} G_2$, if for every
l-terminal graph H, $\Pi(G_1 \circ H) = \Pi(G_2 \circ H)$.

If $\sim_{\Pi,l}$ divides the set of all l-terminal graphs into a finite number of
equivalence classes then $\Pi$ is decidable in linear time for all graphs of tree-width l by
dynamic programming algorithms. The input for the decision algorithms is a
binary rooted decomposition tree of width l. Such a decomposition tree of bounded
width can be computed in linear time for every graph of bounded tree-width.
The dynamic programming algorithms simply compute bottom-up, for
every complete subtree with root u, the equivalence class of the corresponding
l-terminal graph from the equivalence classes of the subtrees of the two sons of u.

3 The Main Result 

In this section, we show that every $\sim_{\Pi_k,l}$, $l \ge 0$, has a finite number of
equivalence classes, where $\Pi_k$ is the graph property “clique-width at most k”. In the
following, we consider the case where we have two l-terminal labeled graphs
$H = (V_H, E_H, P_H, lab_H)$ and $J = (V_J, E_J, P_J, lab_J)$ such that

$G = (V_G, E_G, lab_G) = H \circ J$

has clique-width at most k. We partition the vertex set $V_G$ of G into three disjoint
sets $U_H$, $U_J$, $U_P$ such that $U_H \cup U_J \cup U_P = V_G$. Vertex set $U_H = V_H - P_H$ contains
the inner vertices from H, vertex set $U_J = V_J - P_J$ contains the inner vertices
from J, and vertex set $U_P$ contains the terminals from H and J. Vertex set
$U_P$ has exactly l vertices because the l terminals of H are identified with the l
terminals of J. Graph G does not have any edge between a vertex of $U_H$ and a
vertex of $U_J$.

Let $T = (V_T, E_T, lab_T)$ be a k-expression tree for $G = H \circ J$. The subtree
$T_P$ of T is defined by the l leaves of T that correspond to the l vertices of $U_P$
and by all nodes of T to which there is a path from these leaves. The root of $T_P$
is the root of T. In general, tree $T_P$ is not an expression tree because it does not
represent a valid expression.

The outline of the proof is the following. We will show that for each such
pair H, J, where $G = H \circ J$ has clique-width at most k, there is always at least
one k-expression tree T for G such that $T_P$ has a very special form. This special
form allows us to bound the information describing how H and J are combined. The size
of this information will depend only on k and l but not on the size of H or J.
Let us call this information the connection type of $T_P$.

The definition of the connection type will imply the following. If there are
two systems $(H_1, J_1, T_1)$ and $(H_2, J_2, T_2)$ such that the connection type of $T_1$
and the connection type of $T_2$ are equivalent, then the two graphs $H_1 \circ J_2$ and
$H_2 \circ J_1$ will also have clique-width at most k. If for every system $(H_1, J_1, T_1)$
for $H_1$ and some $J_1$ there is some system $(H_2, J_2, T_2)$ for $H_2$ and some $J_2$ such
that the connection type of $T_1$ and the connection type of $T_2$ are equivalent
and vice versa, then $H_1 \sim_{\Pi_k,l} H_2$, where $\Pi_k$ is the property of clique-width at
most k. Since there is only a bounded number of mutually different connection
types, it follows that there is only a bounded number of equivalence classes for
all l-terminal graphs with respect to equivalence relation $\sim_{\Pi_k,l}$.

Lemma 2. There is always a k-expression tree T for G that satisfies the
following property.

Property 1. Let $u_1$ be a union node of $T_P$ such that one of its sons $u_0$ is in $T_P$
and the other son $u'_0$ is not in $T_P$. Then the vertices of $G(u'_0)$ are either all from
$U_H$ or all from $U_J$.

Proof. Since $G(u'_0)$ does not contain vertices from $U_P$, we know that the vertices
of $G(u'_0)$ are all from $U_H \cup U_J$. If the vertices of $G(u'_0)$ are neither all from $U_H$
nor all from $U_J$ then let $T_1$ and $T_2$ be the k-expression trees that define the
subgraphs of $G(u'_0)$ induced by the vertices of $U_H$ and $U_J$, respectively. We
replace the subtree $T(u'_0)$ by $T_1$, insert a new union node $v_0$ between $u_1$ and $u_0$,
and make the root of $T_2$ the second son of $v_0$, which is not in $T_P$. The resulting
tree obviously defines the same graph as before but $u_1$ now satisfies property 1.

The transformation of an arbitrary expression tree T into normal form as in
the proof of lemma 1 (omitted in this version) does not change property 1 of
T. That is, we can additionally assume that we have a k-expression tree T for
$G = H \circ J$ in normal form which satisfies property 1.

Let $u_1$ be a union node of $T_P$ such that one of its sons $u_0$ is in $T_P$ and the
other son $u'_0$ is not in $T_P$. We define $\xi(u_1) := H$ or $\xi(u_1) := J$ if the vertices of
$G(u'_0)$ are all from $U_H$ or all from $U_J$, respectively. In all other cases and in the
case where $u_1$ is not a union node, we say $\xi(u_1)$ is undefined. By lemma 2 we
can assume that $\xi(u_1)$ is defined for all union nodes $u_1$ of $T_P$ for which exactly
one of their sons is in $T_P$.

The tree $T_P$ consists of exactly $2l - 1$ maximal paths $p = (u_1, \ldots, u_{s'})$, $s' \ge 1$,
such that $u_1$ is a leaf of $T_P$ or a union node with two sons in $T_P$ and all nodes
$u_2, \ldots, u_{s'}$ of the path have exactly one son in $T_P$. The last node $u_{s'}$ is either
the root of $T_P$ or a son of some union node whose sons are both in $T_P$. All the
graphs $G(u_s)$ for $s = 1, \ldots, s'$ contain the same vertices of $U_P$. Such a path is
called a 1-path or path of type 1. Every node of $T_P$ is in exactly one of these
$2l - 1$ paths of type 1.

We divide every 1-path p of $T_P$ into a path of type 1.a and a path of type
1.b. The paths of type 1.a are the maximal paths q whose first node $u_1$ is a leaf
of $T_P$ or a union node whose sons $u_0$ and $u'_0$ are both in $T_P$ and all other nodes
of q are edge insertion or relabeling nodes. The remaining parts of $T_P$ are called
paths of type 1.b. There are exactly $2l - 1$ paths of type 1.a and at most $2l - 1$
paths of type 1.b. For every union node $u_1$ in a 1.b-path there is either $\xi(u_1) = H$
or $\xi(u_1) = J$.

A maximal subpath $(u_1, \ldots, u_{r'}, \ldots, u_{s'})$ of a 1.b-path p such that $u_1$ is a
union node, $u_2, \ldots, u_{r'}$ are edge insertion nodes, and $u_{r'+1}, \ldots, u_{s'}$ are relabeling
nodes, is called a frame of p. Every frame has at most $\binom{k}{2} + k$ nodes, because
there are one union node $u_1$, at most $\binom{k}{2}$ edge insertion nodes $u_2, \ldots, u_{r'}$, and
at most $k - 1$ relabeling nodes $u_{r'+1}, \ldots, u_{s'}$.
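As a quick arithmetic check of the frame bound, the three contributions (one union node, at most "k choose 2" edge insertion nodes, at most k - 1 relabeling nodes) can be added up directly. The helper name `max_frame_nodes` is hypothetical and serves only this illustration.

```python
from math import comb


def max_frame_nodes(k):
    """Upper bound on the number of nodes of a frame for k labels."""
    union_nodes = 1
    edge_insertion_nodes = comb(k, 2)  # at most one per unordered pair of labels
    relabeling_nodes = k - 1
    return union_nodes + edge_insertion_nodes + relabeling_nodes


# The sum agrees with the closed form C(k, 2) + k for every k tried here.
checks = [max_frame_nodes(k) == comb(k, 2) + k for k in range(1, 20)]
```

For k = 3, for instance, the bound is 1 + 3 + 2 = 6 nodes.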

For some node $u_s$ of the k-expression tree T, let $L_H(u_s)$, $L_J(u_s)$, and $L_P(u_s)$
be the sets of the labels of the vertices of $G(u_s)$ which are from $U_H$, $U_J$, and
$U_P$, respectively. The intersections $L_P(u_s) \cap L_H(u_s)$, $L_P(u_s) \cap L_J(u_s)$,
$L_H(u_s) \cap L_J(u_s)$, and $L_P(u_s) \cap L_H(u_s) \cap L_J(u_s)$ are abbreviated by

$L_{P \cap H}(u_s)$, $L_{P \cap J}(u_s)$, $L_{H \cap J}(u_s)$, and $L_{P \cap H \cap J}(u_s)$,
respectively.

Lemma 3. There is always a k-expression tree T for G in normal form which
satisfies property 1 and 2.

Property 2. Let $u_s$ be a relabeling node of $T_P$ labeled by $\rho_{i \to j}$ and let $u_{s-1}$ be
the son of $u_s$. If $i \in L_P(u_{s-1})$ then $j \in L_P(u_{s-1})$.

Proof. Assume $T_P$ is in normal form and satisfies property 1. Let $q = (u_1, \ldots,
u_{r'}, \ldots, u_{s'})$ be a frame of $T_P$ such that $u_2, \ldots, u_{r'}$ are edge insertion nodes,
and $u_{r'+1}, \ldots, u_{s'}$ are relabeling nodes. Let $u_s$, $r' < s \le s'$, be labeled by $\rho_{i \to j}$
and let $i \in L_P(u_{s-1})$. We additionally assume that $T(u_{s-1})$ already satisfies the
lemma.

If $j \notin L_P(u_{s-1})$ then we replace in subtree $T(u_{r'})$ simultaneously every label
i by label j and every label j by label i. The resulting subtree $T(u_{r'})$ is still in
normal form and satisfies property 1 and 2. The resulting tree $T(u_s)$ is now also
in normal form, satisfies property 1 and 2, and defines the same graph as before.

By lemma 3 we can assume that for every son $u_{s-1}$ of some relabeling node
$u_s$ of $T_P$, $L_P(u_s) \subseteq L_P(u_{s-1})$. If $L_P(u_s) = L_P(u_{s-1})$, then the reverse inclusion
holds true for the sets $L_{P \cap H}(u_s)$ and $L_{P \cap J}(u_s)$, i.e., $L_{P \cap H}(u_s) \supseteq L_{P \cap H}(u_{s-1})$
and $L_{P \cap J}(u_s) \supseteq L_{P \cap J}(u_{s-1})$. This allows us to divide every 1.b-path p into
paths of type 2.a and paths of type 2.b as follows. The 2.a-paths are the frames
$q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ of p for which at least one of the following properties
holds.

1. There is some relabeling node $u_s$, $r' < s \le s'$, such that $|L_P(u_s)| <
|L_P(u_{s-1})|$, $|L_{P \cap H}(u_s)| > |L_{P \cap H}(u_{s-1})|$, or $|L_{P \cap J}(u_s)| > |L_{P \cap J}(u_{s-1})|$.

2. $|L_{P \cap H}(u_1)| > |L_{P \cap H}(u_0)|$ or $|L_{P \cap J}(u_1)| > |L_{P \cap J}(u_0)|$, where $u_0$ is the son
of $u_1$ in $T_P$.

The 2.b-paths are the remaining parts of p. In a 2.b-path p all the sets $L_P(u_s)$
are equal, all the sets $L_{P \cap H}(u_s)$ are equal, all the sets $L_{P \cap J}(u_s)$ are equal, and
all the sets $L_{P \cap H \cap J}(u_s)$ are equal, for all nodes $u_s$ of p including the son $u_0$ of
the first node $u_1$ which is in $T_P$.

For a frame $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ of a 2.b-path let

$L_P(q) = L_P(u_1)$, $L_{P \cap H}(q) = L_{P \cap H}(u_1)$, $L_{P \cap J}(q) = L_{P \cap J}(u_1)$.

We use q instead of some node of q as the argument to emphasize that the sets
above are equal for all nodes of q including the son $u_0$ of the first node of q which
is in $T_P$. It is easy to see that for every 1.b-path p there are at most 3k paths of
type 2.b and at most 3k - 1 paths of type 2.a.

Lemma 4. There is always a k-expression tree T for G in normal form that
satisfies property 1 and 2 and additionally property 3.

Property 3. Let p be a 2.b-path of $T_P$, and $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ be a frame
of p such that $u_2, \ldots, u_{r'}$ are edge insertion nodes, $u_{r'+1}, \ldots, u_{s'}$ are relabeling
nodes, and $u_s$, $r' < s \le s'$, is labeled by $\rho_{i \to j}$. If $i \in L_{H \cap J}(u_{s-1})$ then $j \in
L_{H \cap J}(u_1)$.

Proof. Assume T is in normal form and satisfies property 1 and 2. Let $i \in
L_{H \cap J}(u_{s-1})$.

If $j \in L_P(q)$ then the assumption $i \in L_{H \cap J}(u_{s-1})$ and the relabeling $\rho_{i \to j}$
at node $u_s$ imply $j \in L_{P \cap H \cap J}(u_s) = L_{P \cap H \cap J}(q)$ and thus $j \in L_{H \cap J}(u_1)$.

If $j \notin L_P(q)$ and $j \notin L_{H \cap J}(u_1)$ then we replace in subtree $T(u_{r'})$
simultaneously every label i by label j and every label j by label i. The resulting subtree
$T(u_{r'})$ is still in normal form and satisfies property 1, 2, and 3. The resulting
tree $T(u_{s'})$ is now also in normal form, satisfies property 1, 2, and 3, and defines
the same graph as before.
For some node $u_s$ of $T_P$ and some label $j \in [k]$ let $forb_P(u_s, j)$ be the set
of all labels $i \in L_P(u_s)$ such that graph $G(u_s)$ has two non-adjacent vertices
labeled by i and j. If the set $forb_P(u_s, j)$ is empty then either graph $G(u_s)$ has
no vertex labeled by j or every vertex of $G(u_s)$ labeled by j is adjacent to every
vertex of $G(u_s)$ labeled by some label of $L_P(u_s)$. Note that for every node $u_s$ of
$T_P$, set $L_P(u_s)$ is non-empty.

Let $u_s$ be a relabeling node of $T_P$ labeled by $\rho_{i \to j}$ and let $u_{s-1}$ be the son of
$u_s$. If $i \notin L_P(u_{s-1})$ then obviously

$forb_P(u_s, j) = forb_P(u_{s-1}, j) \cup forb_P(u_{s-1}, i)$.
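The identity for relabeling nodes can be checked on a toy graph. The sketch below rests on one reading of the definition that is an assumption here: the vertex carrying label i is taken from the set U_P of identified terminals, while the vertex labeled j may be any other vertex. The function names and the graph encoding are likewise made up for this illustration.

```python
def forb(labels, edges, u_p, j):
    """Labels of U_P-vertices x for which some vertex y != x labeled j
    is not adjacent to x.

    labels: dict vertex -> label, edges: set of frozensets {x, y},
    u_p: set of the vertices that belong to U_P.
    """
    return {
        labels[x]
        for x in u_p
        for y in labels
        if y != x and labels[y] == j and frozenset((x, y)) not in edges
    }


def relabel(labels, i, j):
    """Apply the relabeling i -> j: every vertex labeled i gets label j."""
    return {v: (j if lab == i else lab) for v, lab in labels.items()}


# Toy graph: vertices 0 and 1 are in U_P, so the labels used on U_P are {1, 2}
# and label 3 does not occur on U_P.
labels = {0: 1, 1: 2, 2: 3, 3: 4}
edges = {frozenset(e) for e in [(0, 1), (0, 2), (1, 3)]}
u_p = {0, 1}

before_j = forb(labels, edges, u_p, 4)  # forb set for j = 4 at the son
before_i = forb(labels, edges, u_p, 3)  # forb set for i = 3 at the son
after_j = forb(relabel(labels, 3, 4), edges, u_p, 4)  # after relabeling 3 -> 4
```

Here `before_j` is {1}, `before_i` is {2}, and `after_j` is their union {1, 2}: after the relabeling, the vertices labeled j are exactly those that carried label i or j before, and no vertex of U_P changes its label.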

Lemma 5. Assume T is in normal form and satisfies property 1, 2, and 3. Let
p be a 2.b-path of $T_P$, let $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ be a frame of p, and let $u_0$
be the son of $u_1$ which is in $T_P$. If a node $u_s$, $r' < s \le s'$, is labeled by $\rho_{i \to j}$ and
if $i \in L_{H \cap J}(u_{s-1})$ then $forb_P(u_0, j) \subseteq forb_P(u_{s'}, j)$.

Lemma 5 allows us to divide every 2.b-path p into paths of type 3.a and paths
of type 3.b as follows. The 3.a-paths are the frames $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$
of p for which there is some relabeling node $u_s$, $r' < s \le s'$, or some label
$i \in L_P(q) \cup L_{H \cap J}(u_0)$, such that $|L_{H \cap J}(u_s)| \ne |L_{H \cap J}(u_{s-1})|$, $|L_{H \cap J}(u_1)| >
|L_{H \cap J}(u_0)|$, or $|forb_P(u_{s'}, i)| \ne |forb_P(u_0, i)|$, where $u_0$ is the son of $u_1$ in $T_P$.
The 3.b-paths are the remaining parts of p. In a 3.b-path p all the sets $L_{H \cap J}(u_s)$
are equal for all nodes $u_s$ of p including the son of the first node which is in $T_P$.
For a frame $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ of a 3.b-path let $L_{H \cap J}(q) = L_{H \cap J}(u_{s'})$
and $forb_P(q, j) = forb_P(u_{s'}, j)$. Then for all frames q of a 3.b-path p, including the
frame immediately before the first frame of p, all the sets $L_{H \cap J}(q)$ and $forb_P(q, j)$
are equal for all $j \in [k]$.

Lemma 5 implies that the number of 3.a-paths and 3.b-paths in some 2.b-path
can be estimated by some constant c depending only on k.

Lemma 6. Let $q = (u_1, \ldots, u_{r'}, \ldots, u_{s'})$ be a frame of a 3.b-path p such that
$u_2, \ldots, u_{r'}$ are edge insertion nodes, $u_{r'+1}, \ldots, u_{s'}$ are relabeling nodes, and $u_s$,
$r' < s \le s'$, is labeled by $\rho_{i \to j}$.

1. If $\xi(u_1) = H$ then $i \in L_H(u_{s-1}) - L_P(q) - L_J(u_{s-1})$ and $j \in L_H(u_{s-1})$.

2. If $\xi(u_1) = J$ then $i \in L_J(u_{s-1}) - L_P(q) - L_H(u_{s-1})$ and $j \in L_J(u_{s-1})$.

Lemma 7. There is always a k-expression tree T for G in normal form that
satisfies property 1, 2, and 3 and additionally property 4.

Property 4- Every 3.6-path p is divided into at most 3 • paths p' such 

that for all frames q = {ui , ... ,up, ■ ■ ■ , Ug>) of p' all ^(ui) are equal. 

Lemma 0 allows us to divide every 3.6-path into paths of type 4. a 

and 2 • 2^^*^+^^ paths of type 4.6. The paths of type 4. a are the frames pi. The 
paths of type 4.6 are the paths p 2 ,i and P 2 , 2 , where for all the frames q of p 2 .i 
all ^{q) are equal and for all the frames q of p 2,2 all f{q) are equal. 






Let us now summarize how the paths of Tp are partitioned. We have 

1. exactly $2l - 1$ paths of type 1.a,

2. at most $(2l - 1) \cdot (3k - 1)$ paths of type 2.a,

3. at most $(2l - 1) \cdot 3k \cdot c$ paths of type 3.a,

4. at most {21 — 1) ■ 3k ■ c ■ paths of type 4. a, and 

5. at most {21 — 1) ■ 3k ■ c ■ 2 ■ paths of type 4.6. 

Every node of $T_P$ is in exactly one of these paths. The paths of type 1.a, 2.a,
3.a, and 4.a have at most $\binom{k}{2} + k$ nodes. For all frames $q = (u_1, \ldots, u_{s'})$ in a
path of type 4.b all $\xi(u_1)$ are equal, all sets $forb_P(u_{s'}, j)$ are equal for all $j \in [k]$,
and all sets $L_P(q)$ and $L_{H \cap J}(q)$ are equal.

Definition 6 (Connection type). Assume $T_P$ satisfies property 1, 2, 3, and 4.
Then the connection type $C_P$ of $T_P$ is a labeled tree obtained from $T_P$ by replacing
every 4.b-path $p = (u_1, \ldots, u_{s'})$ by some special node v and two edges $(u_1, v)$ and
$(v, u_{s'})$. Every usual node $u_s$ of $T_P$ will be labeled by its clique-width operation
and additionally by $\xi(u_s)$, $L_P(u_s)$, $L_J(u_s)$, $L_H(u_s)$, and all $forb_P(u_s, j)$ for all
$j \in [k]$. Each leaf $u_s$ of $T_P$ represents a vertex v of G obtained by joining two
terminal vertices, one from the l-terminal graph H and one from the l-terminal
graph J. If v is obtained by joining the i-th terminal of H with the i-th terminal
of J then leaf $u_s$ is additionally labeled by index i.

Two connection types $C_{P,1}$, $C_{P,2}$ are called equivalent if there is a bijection
b between the nodes of $C_{P,1}$ and $C_{P,2}$ such that

1. node $u_{s-1}$ is a son (left son, right son) of node $u_s$ in $C_{P,1}$ if and only if node
$b(u_{s-1})$ is a son (left son, right son, respectively) of node $b(u_s)$ in $C_{P,2}$,

2. if node $u_s$ of $C_{P,1}$ is a special node then node $b(u_s)$ of $C_{P,2}$ is a special node,

3. node $u_s$ of $C_{P,1}$ and node $b(u_s)$ of $C_{P,2}$ are equally labeled.
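Because the sons of every node are ordered, a bijection satisfying the three conditions can be searched for by a simple simultaneous traversal of both trees. The sketch below is an illustration under assumptions: the `Node` class, its field names, and the encoding of all label data as one tuple are hypothetical, and condition 2 is treated symmetrically (a special node must be matched with a special node in both directions).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: tuple                 # operation plus the additional label data
    special: bool = False        # True for a node that replaces a 4.b-path
    sons: list = field(default_factory=list)  # ordered: [left, right] or [only son]


def equivalent(a, b):
    """Check whether two connection types, given by their roots, are equivalent."""
    if a.special != b.special or a.label != b.label:
        return False
    if len(a.sons) != len(b.sons):
        return False
    # Sons are compared position by position, so left sons are matched with
    # left sons and right sons with right sons.
    return all(equivalent(x, y) for x, y in zip(a.sons, b.sons))


def example(swap=False):
    """Build a small tree: a union node over a leaf and a special node."""
    left = Node(("leaf", 1))
    right = Node((), special=True, sons=[Node(("leaf", 2))])
    sons = [right, left] if swap else [left, right]
    return Node(("union",), sons=sons)


same = equivalent(example(), example())
swapped = equivalent(example(), example(swap=True))
```

Two copies of the same tree are reported as equivalent, whereas exchanging the left and right son of the root breaks the equivalence, as condition 1 requires.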



Theorem 1. Let $(H_1, J_1, T_1)$ and $(H_2, J_2, T_2)$ be two systems such that

1. $H_1$, $H_2$, $J_1$, $J_2$ are l-terminal graphs,

2. $H_1 \circ J_1$ and $H_2 \circ J_2$ have clique-width at most k,

3. $T_1$ and $T_2$ are k-expression trees for $H_1 \circ J_1$ and $H_2 \circ J_2$, respectively, in
normal form which satisfy property 1, 2, 3, and 4,

4. and the two connection types $C_{P,1}$ and $C_{P,2}$ of $T_{P,1}$ and $T_{P,2}$ are equivalent.

Then $H_1 \circ J_2$ and $H_2 \circ J_1$ have also clique-width at most k.

By theorem 1 and the fact that there is only a finite number of connection
types for fixed integers l and k it follows that the equivalence relation $\sim_{\Pi_k,l}$ has
a finite number of equivalence classes. For an l-terminal graph H let $\mathcal{C}(H)$ be
the set of all connection types $C_P$ defined as follows. A connection type $C_P$ is
in $\mathcal{C}(H)$ if there is some l-terminal graph J such that $H \circ J$ has clique-width k
and there is a k-expression tree T for $H \circ J$ such that $T_P$ is in normal form and
satisfies property 1, 2, 3, and 4 and $C_P$ is the connection type of $T_P$. If for two
l-terminal graphs $H_1$ and $H_2$ the sets $\mathcal{C}(H_1)$ and $\mathcal{C}(H_2)$ are equal, then $H_1$ and
$H_2$ are replaceable with respect to the clique-width k property.
Abstract. We prove tight and near-tight combinatorial complexity
bounds for vertical decompositions of arrangements of linear surfaces
in four dimensions. In particular, we prove a tight upper bound of $\Theta(n^4)$
for the vertical decomposition of an arrangement of n hyperplanes in
four dimensions, improving the best previously known bound by a
logarithmic factor. We also show that the complexity of the vertical
decomposition of an arrangement of n 3-simplices in four dimensions is
$O(n^4 \alpha(n) \log n)$, improving the best previously known bound by a
near-linear factor. We believe that the techniques used for obtaining these
results can also be extended to analyze decompositions of arrangements
of fixed-degree algebraic surfaces (or surface patches) in four dimensions.



1 Introduction 

Given a collection $\Gamma$ of n fixed-degree algebraic surfaces in $\mathbb{R}^d$, its arrangement
is denoted by $\mathcal{A}(\Gamma)$. The complexity of a single cell in $\mathcal{A}(\Gamma)$ may be as large as
There exist many geometric algorithms that require arrangements to 
be decomposed into cells of constant description complexity (that is, defined in 
terms of a constant number of polynomial equalities and inequalities of constant 
maximum degree) . As a result, devising a decomposition scheme that decomposes 
arrangements into as few as possible such cells is an intensively studied problem. 

The most efficient general-purpose decomposition scheme is the vertical
decomposition, which is suitable for arrangements of fixed-degree algebraic surfaces
in any dimension. We define vertical decompositions in Section 2, and refer the
reader to 0 for details on the general scheme. The vertical decomposition was 
originally introduced in the context of two-dimensional problems, and was
extended to higher dimensions in the late 1980's [3]. The complexity of the vertical
decomposition of an arrangement of n triangles (or planes) in $\mathbb{R}^3$ is known to
be $O(n^2 \alpha(n) \log n + K)$, where $K = O(n^3)$ is the complexity of the
undecomposed arrangement. However, there is still a substantial gap between the
known upper and lower bounds for the complexity of vertical decompositions in 
dimensions higher than three. 

* This work was supported by a grant from the Israeli Academy of Sciences (center of 
excellence) . 
The number of cells in the vertical decomposition of an arrangement of n
fixed-degree algebraic surfaces in $\mathbb{R}^4$ is only known to be $O(n^5 \beta(n))$, where
$\beta(n)$ is an extremely slow-growing function of n (which also depends on d and on
the maximum degree of the polynomials that define the surfaces of $\Gamma$), related to
Davenport-Schinzel sequences. The problem of improving this upper bound
(which is larger than the known lower bound by a near-linear factor) has been
stated as an important open problem numerous times, but has remained open
for more than a decade.

Such an improvement has immediate wide-ranging algorithmic applications, 
as it would automatically improve the (asymptotic) running time of many al- 
gorithms that utilize decomposed 4-dimensional arrangements. These include 
algorithms for point location, range searching, robot motion planning, and a va- 
riety of geometric optimization problems (see 0 Section 8.3] and the references 
therein). An interesting result in this direction is that of Guibas et al. |Z], who 
showed that the vertical decomposition of an arrangement of n hyperplanes in
four dimensions has complexity $O(n^4 \log n)$.

In this paper, we prove a tight bound of $\Theta(n^4)$ for the case of hyperplanes just
mentioned. Moreover, we bound the complexity of the vertical decomposition of
an arrangement of n 3-simplices in four dimensions by $O(n^4 \alpha(n) \log n)$. This
improves the best previously known upper bound for this setting jS] by a near- 
linear factor. 

We note that an arrangement of simplices as above can be decomposed into
$O(n^4)$ cells by extending the simplices into hyperplanes, and decomposing the
resulting hyperplane arrangement using the bottom-vertex simplicial
decomposition. In light of this special decomposition scheme for linear arrangements,
the principal significance of this paper is in the fact that it introduces general 
techniques for analyzing vertical decompositions in four dimensions. 



2 Vertical Decompositions in Four Dimensions 

The construction. Denote the coordinates by x, y, z and w. Given a collection
$\Gamma$ of 3-simplices in $\mathbb{R}^4$, the vertical decomposition of $\mathcal{A}(\Gamma)$, denoted by $\mathcal{V}(\Gamma)$, is
constructed as follows.

For each $S \in \Gamma$, erect a 3D z-vertical visibility wall on the boundary of S (a
2D piecewise linear surface denoted by $\partial S$), which is defined as the union of all
z-vertical (visibility) segments that have an end-point on $\partial S$ and are
interior-disjoint from all simplices of $\Gamma$. Also, for each pair $S, T \in \Gamma$, erect a 3D z-vertical
visibility wall on the 2D (linear) surface $S \cap T$ in a similar fashion. This results
in a decomposition of $\mathcal{A}(\Gamma)$ into (not necessarily convex) z-vertical prisms, such
that the floor (respectively, ceiling) of each prism, if it exists, is contained in a
single simplex of $\Gamma$. We denote this decomposition by $\mathcal{V}_1(\Gamma)$.

For a prism P of $\mathcal{V}_1(\Gamma)$, the simplex containing its floor (respectively, ceiling)
is denoted by $P_F$ (respectively, $P_C$). Projecting P onto its floor or ceiling (in the
z-direction) results in a 3D polyhedron, denoted by $P_{3D}$. We first decompose
$P_{3D}$ by erecting 2D y-vertical visibility walls on each of its edges. For an edge f
of $P_{3D}$, the wall erected on it is defined as the union of all y-vertical (visibility)
segments that have an end-point on f and are fully contained in $P_{3D}$. We then
decompose P by erecting z-vertical 3D visibility walls on each such y-vertical
2D visibility wall of $P_{3D}$. Repeating this process for all prisms P results in a
decomposition of $\mathcal{V}_1(\Gamma)$, which we denote by $\mathcal{V}_2(\Gamma)$.

We further refine $\mathcal{V}_2(\Gamma)$ as follows. For a prism Q of $\mathcal{V}_2(\Gamma)$, consider its
z-projection $Q_{3D}$ (which is a y-vertical prism in $\mathbb{R}^3$). Projecting $Q_{3D}$ onto its floor
or ceiling (in the y-direction) results in a 2D polygon $Q_{2D}$, which we decompose
by erecting 0, 1, or 2 x-vertical (possibly infinite) visibility segments on each
vertex v of $Q_{2D}$, defined in analogy to the above. We then erect y-vertical 2D
walls (inside $Q_{3D}$) on each such x-vertical segment of $Q_{2D}$. Subsequently,
z-vertical 3D walls (inside Q) are erected on each such y-vertical 2D wall of $Q_{3D}$.
Repeating this process for each prism Q of $\mathcal{V}_2(\Gamma)$ decomposes $\mathcal{V}_2(\Gamma)$ into cells of
constant description complexity: each such cell is a convex polyhedron bounded
by up to 8 facets.

This completes the construction of the vertical decomposition $\mathcal{V}(\Gamma)$. Similar
constructions can be formulated for arrangements of general algebraic surfaces in
$\mathbb{R}^4$, and for arrangements of simplices or algebraic surfaces in higher dimensions.
We refer the reader to |E] for more details on the vertical decomposition scheme 
in those more general settings. 



Preliminary analysis. The following two observations will be helpful in
analyzing the complexity of $\mathcal{V}(\Gamma)$. First, it is easy to see that the complexity of $\mathcal{V}(\Gamma)$
is asymptotically the same as the complexity of $\mathcal{V}_2(\Gamma)$, and it is thus sufficient
to bound the latter in order to bound the former. Second, the complexity of
$\mathcal{V}_2(\Gamma)$ is asymptotically the same as the complexity of $\mathcal{V}_1(\Gamma)$ plus the number of
y-vertical visibility events inside all the projection polyhedra $P_{3D}$ of the prisms
P of $\mathcal{V}_1(\Gamma)$. (Such an event is said to happen between two edges, f and f', of
$P_{3D}$ if they intersect a common y-vertical line l, and the segment $s \subset l$ that
connects f and f' on this line lies completely inside $P_{3D}$.) We start by bounding
the complexity of $\mathcal{V}_1(\Gamma)$.

Lemma 1. Given a collection $\Gamma$ of n 3-simplices in $\mathbb{R}^4$, the complexity of $\mathcal{V}_1(\Gamma)$
is $O(n^4 \alpha(n))$. If $\Gamma$ consists of hyperplanes, the complexity of $\mathcal{V}_1(\Gamma)$ is $O(n^4)$.

Proof. During the construction of $\mathcal{V}_1(\Gamma)$, a z-vertical visibility wall is erected on
$S \cap T$, for all $S, T \in \Gamma$. By construction, this wall is bounded from above by the
lower envelope of the part of $\mathcal{A}(\Gamma)$ that lies above $S \cap T$ (within the z-vertical
hyperplane spanned by $S \cap T$). Similarly, it is bounded from below by the upper
envelope of the part of $\mathcal{A}(\Gamma)$ that lies below $S \cap T$. The complexity of these
upper and lower envelopes is $O(n^2 \alpha(n))$. Consequently, the complexity of the
wall erected on $S \cap T$ is also $O(n^2 \alpha(n))$.

The same arguments imply that the complexity of the visibility wall erected
on each $\partial S$ (for $S \in \Gamma$) is also $O(n^2 \alpha(n))$. Since there are n such boundaries
$\partial S$ and $O(n^2)$ such intersections $S \cap T$, the overall number of features that are
created during the construction of $\mathcal{V}_1(\Gamma)$ is $O(n^4 \alpha(n))$.

If $\Gamma$ is a collection of hyperplanes, $S \cap T$ is a 2-dimensional plane (for all
$S, T \in \Gamma$), and the complexity of the above-mentioned envelopes is $O(n^2)$. This
is because the envelopes are a portion of the zone of $S \cap T$ within the cross-section
of $\mathcal{A}(\Gamma)$ in the vertical hyperplane spanned by $S \cap T$; the complexity of such a
zone is $O(n^2)$. The number of created features in this case is thus $O(n^4)$. □



Visibility events. In light of the above, the bulk of this paper is devoted to
bounding the number of y-vertical visibility events inside all polyhedra $P_{3D}$. To
this end, we first classify the faces and edges of a projected prism $P_{3D}$, obtained
as the z-vertical projection of a prism P of $\mathcal{V}_1(\Gamma)$. Each face of $P_{3D}$ is a z-vertical
projection of a face of P, which is a part of a 3-dimensional z-vertical wall erected
on a 2-dimensional feature of $\mathcal{A}(\Gamma)$ (during the construction of $\mathcal{V}_1(\Gamma)$). These
walls can be of three types.

– An (upward) visibility wall erected on $P_F \cap S$ (for some $S \in \Gamma$) that touches
$P_C$ (from below). Faces of $P_{3D}$ that are projections of parts of such walls are
said to be red.

– A (downward) visibility wall erected on $P_C \cap T$ (for some $T \in \Gamma$) that touches
$P_F$ (from above). Corresponding faces of $P_{3D}$ are said to be green.

– A visibility wall erected on $\partial U$ (for some $U \in \Gamma$) that touches $P_F$ (from
above) and $P_C$ (from below). (Intuitively, the boundary of U is partly
“floating” between $P_F$ and $P_C$ and the z-vertical wall erected on it reaches both
$P_F$ and $P_C$.) Corresponding faces of $P_{3D}$ are said to be blue. (Note that in
the case of hyperplanes there are no blue faces.)

Any edge of $P_{3D}$ is incident to two faces of $P_{3D}$. Such edges can thus be
classified into four types, depending on the types of the incident faces.

1. Edges incident to two red (or two green) faces. The two faces, by definition,
correspond to parts of the visibility walls erected on $P_F \cap S$ and $P_F \cap T$
(respectively, on $P_C \cap S$ and $P_C \cap T$), for some $S, T \in \Gamma$. An edge incident to
both of them thus corresponds to the common part of these two walls, which
is the visibility wall erected on $P_F \cap S \cap T$ (respectively, $P_C \cap S \cap T$). We
will denote such edges mnemonically by $E_3$ to signify that they are formed
by an intersection of three simplices.

2. Edges incident to a red and a green face. Such an edge corresponds to the
common part of two walls, one erected (upward) on $P_F \cap S$ and another
erected (downward) on $P_C \cap T$, for some $S, T \in \Gamma$. This part is composed
of z-vertical segments that touch both $P_F \cap S$ and $P_C \cap T$. We will denote
such edges by $E_{22}$ to signify that they are formed by interaction of two
intersections, each of two simplices.

3. Edges incident to a red (or green) face and a blue face. Such an edge
corresponds to the common part of two walls, one erected on $P_F \cap S$ (respectively,
on $P_C \cap S$) and another erected on $\partial T$, for some $S, T \in \Gamma$. This part is
composed of z-vertical segments that touch $P_F \cap S$ from above (respectively,
$P_C \cap S$ from below), pass through $\partial T$ and touch $P_C$ (respectively, $P_F$). We
will denote such edges by $E_{21}$ to signify that they are formed by an
intersection of two simplices, and by a boundary of a third simplex.

An interesting special case is the one in which S = T, i.e. one face
corresponds to $P_F \cap S$ or $P_C \cap S$, and another to $\partial S$. Such edges correspond to
a visibility wall erected on $P_F \cap \partial S$ or $P_C \cap \partial S$, and will be denoted by $E_2$.

4. Edges incident to two blue faces. Such an edge corresponds to the common
part of the visibility walls erected on $\partial S$ and $\partial T$, for some $S, T \in \Gamma$. We will
denote such edges by $E_{11}$ to signify that they are formed by two boundaries
of simplices.

An interesting special case is the one in which S = T. In this case the edge
is incident to two blue faces that correspond to walls erected on two incident
2-dimensional features of $\partial S$, and it therefore corresponds to a visibility wall
erected on a 1-dimensional feature of $\partial S$. Such edges will be denoted by $E_1$.

To recap, the possible mnemonic representations of the edges of $P_{3D}$ are $E_3$,
$E_{22}$, $E_{21}$, $E_2$ (a special case of $E_{21}$), $E_{11}$, and $E_1$ (a special case of $E_{11}$).
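The classification of edges by the colors of their two incident faces amounts to a small lookup. The following sketch uses a hypothetical function name and plain strings for colors and edge types; the special cases $E_2$ and $E_1$ are not distinguished, since they depend on whether S = T rather than on the face colors.

```python
def edge_type(face_a, face_b):
    """Return the general edge type for the colors of the two incident faces."""
    colors = sorted((face_a, face_b))
    if colors in (["red", "red"], ["green", "green"]):
        return "E3"   # intersection of three simplices
    if colors == ["green", "red"]:
        return "E22"  # two intersections, each of two simplices
    if colors == ["blue", "blue"]:
        return "E11"  # two boundaries of simplices
    if "blue" in colors and colors[1] in ("green", "red"):
        return "E21"  # one intersection of two simplices and one boundary
    raise ValueError(f"unknown face colors: {face_a!r}, {face_b!r}")


types = {edge_type(a, b)
         for a in ("red", "green", "blue")
         for b in ("red", "green", "blue")}
```

Running the function over all color pairs yields exactly the four general types E3, E22, E21, and E11.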

Each y-vertical visibility event inside $P_{3D}$ corresponds to a y-vertical segment
s that lies completely inside $P_{3D}$, such that s connects a point p on an edge of $P_{3D}$
to a point p' on another edge of $P_{3D}$. Notice that p (as well as p') is a z-vertical
projection of a specific z-vertical segment e (respectively, e'), whose bottom
end-point lies on $P_F$, and whose top end-point lies on $P_C$. By construction, e and e'
lie inside a common yz-parallel (2-dimensional) plane, which we denote by $P_{e,e'}$.
The 2-dimensional feature of $\mathcal{V}_2(\Gamma)$ that corresponds to s (that is, the “wall”
that is erected on s inside P) is the trapezoid that has e and e' as its bases. This
trapezoid is necessarily disjoint from $\Gamma$ in its interior. We will sometimes denote
a y-vertical visibility event by (e, e'), where e and e' are as above, and where the
points on e have a smaller y-coordinate than the points on e'.

Fig. 1 provides an exhaustive visual catalogue of the possible types of
y-vertical visibility events. There are 10 such types, determined by the type of
edge that contains the point p and the type of edge that contains the point p'.
For example, one such type of events is $E_3E_{22}$, in which the edge containing p
is of type $\mathcal{E}_3$ and the edge containing p' is of type $\mathcal{E}_{22}$ (or vice versa; this type
can also be denoted by $E_{22}E_3$). For each type of events, the figure shows one
representative configuration of simplices inside the plane $\Pi_{e,e'}$. Notice that each
event is “defined” by six simplices: $P_F$, $P_C$, and four others.
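The count of 10 is consistent with choosing an unordered pair, repetition allowed, from the four general edge types. The following is a minimal sketch of that enumeration; the string labels and the function name `event_types` are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations_with_replacement

# The four general edge types. The special cases E2 and E1 are left out,
# since they are covered by E21 and E11, respectively.
EDGE_TYPES = ["E3", "E22", "E21", "E11"]


def event_types(edge_types=EDGE_TYPES):
    """Return one label per unordered pair of edge types (repetition allowed)."""
    return [a + b for a, b in combinations_with_replacement(edge_types, 2)]


if __name__ == "__main__":
    types = event_types()
    print(len(types), types)
```

Because the pairs are unordered, the type written E21E3 in the text appears here under the label `E3E21`.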

Fig. 1 does not show events that involve $\mathcal{E}_2$ and $\mathcal{E}_1$ edges. Since these types
of edges are special cases, the number of such events will be bounded by the
analysis of events that involve the more general $\mathcal{E}_{21}$ and $\mathcal{E}_{11}$ edges.

Three types of events, $E_3E_3$, $E_{21}E_3$, and $E_{21}E_{21}$, have alternative
configurations, as illustrated in Fig. 2. These simpler configurations are not special
cases of the configurations that are shown in Fig. 1(a), Fig. 1(d), and Fig. 1(f),
respectively. Nevertheless, it turns out that the events portrayed in Fig. 1 are
much harder to treat, and our analysis of their number can also be applied, in
simplified form, to bounding the number of events that are in the alternative
configurations. We will thus not treat these configurations explicitly.

[Figure 1 panels include (a) $(E_3, E_3)$, (b) $(E_3, E_{22})$, (c) $(E_{22}, E_{22})$, (d) $(E_{21}, E_3)$.]

Fig. 1. All the possible types of y-vertical visibility events (up to symmetry).



3 Visibility Events in Arrangements of Hyperplanes 

Analyzing vertical decompositions of arrangements of hyperplanes is simpler
than in the (more general) case of 3-simplices, since hyperplanes do not have
boundaries, and thus only three kinds of y-vertical visibility events can occur.
Using the notation introduced in the previous section, these are $E_3E_3$, $E_3E_{22}$
and $E_{22}E_{22}$ events. In order to prove the main result of this section, stated
below, it is therefore sufficient to analyze only events of these three kinds.

Theorem 1. The number of cells in the vertical decomposition of an arrangement
of n hyperplanes in four dimensions is $O(n^4)$.

In the following sequence of three lemmas, we analyze each kind of events
in turn, and prove that there can only be $O(n^4)$ events of each kind. This is
accomplished by charging each event to features of $\mathcal{A}(\Gamma)$ or of $V_1(\Gamma)$, or to
events that were analyzed previously.

Lemma 2. The number of $E_3E_3$ events in an arrangement of n hyperplanes in
4-space is $O(n^4)$.

Proof. For any three hyperplanes $P_F, S, T \in \Gamma$, we slide a point a on the line
$P_F \cap S \cap T$ at constant speed (in any of the two possible directions, say in the
positive direction of the x-axis, from $-\infty$ to $+\infty$). Consider a 2-dimensional
yz-parallel plane $\Pi_a$ that is attached to the point a, so that it contains a at all
times during the sliding. The plane $\Pi_a$ “sweeps” (a part of) $\mathcal{A}(\Gamma)$ in a fixed
direction at a fixed speed. Thus, it contains a dynamic arrangement of lines,
such that each line moves with a fixed speed, and the slope of each line is fixed.
Each such moving line corresponds to a hyperplane; two moving lines intersect
in a moving point, corresponding to an intersection of two hyperplanes, that
also moves along a linear trajectory with a fixed speed. At discrete moments in
time, during the sliding, three lines intersect in a point; such “critical events”
correspond to intersections of three hyperplanes that are swept by $\Pi_a$. Three
of the moving lines are always in degenerate configuration, as they intersect in
a point at all times. These are the lines (that correspond to) $P_F$, $S$, and $T$.
(Abusing the notation slightly, we will denote $X \cap \Pi_a$, for $X \in \Gamma$, simply by $X$,
and similarly for any other feature.)

Fig. 2. Alternative (easier to analyze) configurations of some vertical visibility events:
(a) $(E_3, E_3)$, (b) $(E_{21}, E_3)$, (c) $(E_{21}, E_{21})$.
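The dynamic arrangement inside the sweeping plane can be illustrated with a small sketch. It assumes a hypothetical encoding in which each moving line is written as z = m*y + (p + q*t), with fixed slope m and an offset that changes at constant speed q; this parameterization and the function names are assumptions made for illustration only. Under that encoding, the intersection of two lines moves linearly in t, and the concurrency condition for three lines is linear in t, so a generic triple has exactly one critical time.

```python
from fractions import Fraction
from typing import Optional, Tuple

# Hypothetical encoding of a moving line inside the sweep plane:
#   z = m*y + (p + q*t)
# m is the fixed slope, p the offset at time 0, q the constant offset speed.
Line = Tuple[Fraction, Fraction, Fraction]


def intersection(a: Line, b: Line, t: Fraction) -> Tuple[Fraction, Fraction]:
    """Point (y, z) where two non-parallel moving lines meet at time t."""
    (ma, pa, qa), (mb, pb, qb) = a, b
    y = ((pb + qb * t) - (pa + qa * t)) / (ma - mb)
    return y, ma * y + pa + qa * t


def critical_time(a: Line, b: Line, c: Line) -> Optional[Fraction]:
    """Time at which three pairwise non-parallel moving lines are concurrent.

    The concurrency determinant is linear in t, so there is at most one such
    time. Returns None when the determinant does not depend on t; the lines
    are then concurrent either never or at all times.
    """
    lines = (a, b, c)
    coeff = (c[0] - b[0], a[0] - c[0], b[0] - a[0])
    const = sum(line[1] * k for line, k in zip(lines, coeff))
    rate = sum(line[2] * k for line, k in zip(lines, coeff))
    if rate == 0:
        return None
    return -const / rate
```

The `None` branch covers triples such as the three lines through the sliding point a, which stay concurrent throughout the sweep.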

During this sliding, we associate each $E_3E_3$ event that involves $P_F \cap S \cap T$
with a pair $(U, A)$, where $U \in \Gamma$ and $A$ is an edge in the (4-dimensional) zone
of $U$. Since we go over all triples $(P_F, S, T)$ in this fashion, all $E_3E_3$ events in
$V_2(\Gamma)$ will be associated with such pairs at the end of the process. We will prove
that each such pair can only be charged (i.e., associated with an event) a constant
number of times, which will imply that the number of $E_3E_3$ events is asymptotically
bounded by the number of such pairs in the whole arrangement. The latter number
is $O(n^4)$, since there are n hyperplanes $U$, and the zone of each contains $O(n^3)$
edges $A$.

Consider one $E_3E_3$ event $(e, e')$ that involves $P_F \cap S \cap T$. Assume, without
loss of generality, that the bottom end-point of the segment e lies on $P_F \cap S \cap T$,
and the top end-point of e' lies on an intersection of three other hyperplanes of $\Gamma$,
$P_C \cap U \cap V$, such that an upward z-vertical ray obtained by extending e upwards
hits $U$ before hitting $V$ (see Fig. 3(a)). By definition, there is a specific moment
in time (denote it by $t_0$) during the sliding, at which the plane $\Pi_a$ coincides with
the plane $\Pi_{e,e'}$. At time $t_0 + \varepsilon$, where $\varepsilon > 0$ is arbitrarily small, the y-coordinate
of the point $P_C \cap V$ is either smaller or larger than the y-coordinate of the point
$P_C \cap U$. These two possibilities distinguish between $E_3E_3$ events of type A and
type B, respectively; see Fig. 3.

Suppose the event $(e, e')$ is of type A. Notice that, immediately after time $t_0$,
$U$ is “separated” (within $\Pi_a$) from $P_F \cap S \cap T$ by $P_C$ and $V$, in the sense that
$U$ does not intersect the wedge bounded by $P_C$ and $V$ that contains $P_F \cap S \cap T$
at time $t_0$. Due to linearity of the trajectories, this implies that no event of type
$E_3E_3$ involving both $P_F \cap S \cap T$ and $U$ can happen after time $t_0$ while $P_F \cap S \cap T$
lies inside this wedge. Notice that $P_F \cap S \cap T$ cannot cease to lie inside this wedge
before intersecting either $P_C$ or $V$. Such intersection corresponds to a vertex of
$\mathcal{A}(\Gamma)$.

Fig. 3. The types of $E_3E_3$ and $E_3E_{22}$ events. (a) and (b) show $E_3E_3$ events of types A
and B, respectively, while (c) and (d) similarly show $E_3E_{22}$ events of types A and B.
All illustrations show a representative configuration inside the plane $\Pi_a$ at time $t_0 + \varepsilon$.

Let $A \subset P_F \cap S \cap T$ be the edge of $\mathcal{A}(\Gamma)$ that contains the bottom end-point
of e, at the time $(e, e')$ materializes. The preceding argument implies that no
$E_3E_3$ event involving $A$ and $U$ can happen after time $t_0$. We associate the event
$(e, e')$ with the pair $(U, A)$. Uniqueness of this association is ensured by the fact
that, after the first such charge is made to some pair $(U, A)$ (during sliding on
the edge $A$), no other $E_3E_3$ event of type A involving $U$ and $A$ can occur.

Events $(e, e')$ of type B are handled by a fully symmetric charging argument,
in which the direction of the sliding of a is reversed. We have thus shown that
each $E_3E_3$ event can be associated with a pair $(U, A)$ as described above, such
that each such pair $(U, A)$ is associated with at most one event of type A and at
most one event of type B. This completes the proof of the lemma. □



Remark 1. A $G_{3,3}$ event is said to happen between two points, $p \in P_F \cap S \cap T$
and $p' \in P_C \cap U \cap V$, for some $P_F, P_C, S, T, U, V \in \Gamma$, if p and p' lie inside a
common yz-parallel plane, and p and p' belong to the same cell of $\mathcal{A}(\Gamma)$; see
Fig. 4(a). The proof of Lemma 2 can be modified in a straightforward fashion
to bound the number of $G_{3,3}$ events by $O(n^4)$.



Lemma 3. The number of $E_3E_{22}$ events in an arrangement of n hyperplanes
in 4-space is $O(n^4)$.

Proof. For any $P_F, P_C, S, T \in \Gamma$, we slide a z-vertical segment a, such that
its bottom end-point lies on $P_F \cap S \cap T$ and its top end-point lies on $P_C$, at
constant speed (in any of the two possible directions), with the yz-parallel plane
$\Pi_a$ attached to it.

During the sliding, each $E_3E_{22}$ event $(e, e')$ that involves a will be associated
with either a vertex of $\mathcal{A}(\Gamma)$, or a feature of $V_1(\Gamma)$, or a $G_{3,3}$ event. Since we
go over all quadruples $(P_F, P_C, S, T)$ in this fashion, all $E_3E_{22}$ events in $V_2(\Gamma)$
will be associated with such features at the end of the process. The complexity
of $\mathcal{A}(\Gamma)$ is $O(n^4)$, Lemma 1 bounds the number of features of $V_1(\Gamma)$ by $O(n^4)$,
while Remark 1 states that the number of $G_{3,3}$ events is also $O(n^4)$. As we will
show, each such feature can only be charged (i.e., associated with an event) a constant
number of times, which implies that the number of $E_3E_{22}$ events is also $O(n^4)$.

Fig. 4. Generalized visibility events. (a) shows a $G_{3,3}$ event, while (b-d) show three
$G_{3,22}$ events. Notice that in the case of $G_{3,3}$ and $G_{3,22}$ events, as opposed to $E_3E_3$ and
$E_3E_{22}$ events, the trapezoid that has e and e' as its bases is not necessarily empty.

Consider one $E_3E_{22}$ event $(e, e')$, such that a coincides with e at time $t_0$. (All
$E_3E_{22}$ events $(e, e')$, such that a coincides with e', can be handled analogously.)
Suppose that the bottom end-point of e' lies on $P_F \cap U$ and the top one lies on
$P_C \cap V$, for some $U, V \in \Gamma$. The y-coordinate of the point $P_F \cap U$ (inside the
plane $\Pi_a$) at time $t_0 + \varepsilon$ is either smaller or larger than the y-coordinate of the
point $P_C \cap V$. These two possibilities distinguish between $E_3E_{22}$ events of type
A and type B, respectively; see Fig. 3. Assume that the event $(e, e')$ is of type
A. The other case can be handled by a symmetric argument with the direction
of the sliding reversed, as in the proof of Lemma 2.

Claim. No $E_3E_{22}$ event $(g, g')$ can occur, such that the segment a coincides with
g, after time $t_0$ and before some (moving) line $X$ (corresponding to a hyperplane
$X \in \Gamma$, which may be $U$) intersects either (1) the top end-point of a on $P_C$
(Fig. 5(a)), or (2) the bottom end-point of a on $P_F \cap S \cap T$ (Fig. 5(b)), or (3)
$P_C \cap V$ (Fig. 5(c)).

Proof. After time $t_0$ and before any of the above events happens, the relative
interior of the segment connecting $P_C \cap V$ and the top end-point of a is disjoint
from all hyperplanes of $\Gamma$ (other than $P_C$). Therefore, in any $E_3E_{22}$ event $(g, g')$
as specified in the claim, the top end-point of g' has to lie on $P_C \cap V$ (while
the bottom end-point has to lie on $P_F$). However, in this case g' is necessarily
intersected by $U$, which prevents $(g, g')$ from being an $E_3E_{22}$ event. □

The claim implies that we can associate the event $(e, e')$ with the first time
(after time $t_0$) when some line $X$ causes one of the three events listed in the
claim to happen. Notice that an event of type (3) is a $G_{3,3}$ event, and that an
event of type (2) is a vertex of $\mathcal{A}(\Gamma)$. Notice also that after time $t_0$ and prior to
any of the events specified in the Claim, the segment a is disjoint in its interior
from all hyperplanes of $\Gamma$. This ensures that any event of type (1) corresponds
to a feature of $V_1(\Gamma)$. Thus, we can associate the event $(e, e')$ with a $G_{3,3}$ event,
or a vertex of $\mathcal{A}(\Gamma)$, or a feature of $V_1(\Gamma)$. As in the proof of Lemma 2, each of
these events can be charged at most twice, once for each direction of the sliding
of a. This implies that the number of $E_3E_{22}$ events is $O(n^4)$. □

Fig. 5. The three types of events that are associated with $E_3E_{22}$ events and occur at
a later time.



Remark 2. A $G_{3,22}$ event is said to happen between three points, $p_1 \in P_F \cap S \cap T$,
$p_2 \in P_F \cap U$, and $p_3 \in P_C \cap V$, for some $P_F, P_C, S, T, U, V \in \Gamma$, if all three points
lie inside a common yz-parallel plane, the segment $(p_1, p_2)$ is disjoint from all
hyperplanes of $\Gamma$ other than $P_F$, and the segment $(p_2, p_3)$ is z-vertical and is
disjoint from all hyperplanes of $\Gamma$; see Fig. 4(b-d). By combining ideas from the
proofs of Lemmas 2 and 3, the number of $G_{3,22}$ events can be bounded by $O(n^4)$.
We omit the details due to space limitations.

Lemma 4. The number of $E_{22}E_{22}$ events in an arrangement of n hyperplanes
in 4-space is $O(n^4)$.

Proof. For any four hyperplanes $P_F, P_C, S, T \in \Gamma$, we slide a z-vertical segment
a, such that its bottom end-point lies on $P_F \cap S$ and its top end-point lies on
$P_C \cap T$, at constant speed (in any of the two possible directions), with the
yz-parallel plane $\Pi_a$ attached to it, as above. Assuming general position, the locus
of points on $P_C \cap T$ that are intersected by the z-vertical hyperplane spanned by
$P_F \cap S$ is a line. This implies that the trajectory of a is linear, and $\Pi_a$ contains
a dynamic arrangement of lines, as in the proof of Lemma 2.

Consider one $E_{22}E_{22}$ event $(e, e')$, such that a coincides with e at time $t_0$.
Suppose that the bottom end-point of e' lies on $P_F \cap U$ and the top one lies
on $P_C \cap V$, for some $U, V \in \Gamma$. Assume, without loss of generality, that the
y-coordinate of the point $P_F \cap U$ (inside the plane $\Pi_a$) at time $t_0 + \varepsilon$ is smaller
than the y-coordinate of the point $P_C \cap V$ (see Fig. 6(a)).

Similarly to the claim in the proof of Lemma 3, no $E_{22}E_{22}$ event $(g, g')$
can occur, such that the segment a coincides with g, after time $t_0$ and before
some (moving) line $X$ (corresponding to a hyperplane $X \in \Gamma$, which may be
$U$) intersects either (1) the top end-point of a on $P_C \cap T$ (Fig. 6(b)), or (2) the
bottom end-point of a on $P_F \cap S$ (Fig. 6(c)), or (3) $P_C \cap V$ (Fig. 6(d)).

We can thus associate the event $(e, e')$ with the first time some line $X$ causes
one of these three events to happen. Notice that an event of type (3) is a $G_{3,22}$
event, while the first (in time) event of type (1) or (2) corresponds to a feature of
$V_1(\Gamma)$. Thus, we can associate the event $(e, e')$ with a $G_{3,22}$ event or a feature of
$V_1(\Gamma)$. As above, none of these events is charged more than twice, which implies
that the number of $E_{22}E_{22}$ events is $O(n^4)$. □

Lemmas 2-4 bound the number of all y-vertical visibility events by $O(n^4)$,
thereby implying Theorem 1.

Fig. 6. The analysis of $E_{22}E_{22}$ events. (a) illustrates the configuration inside $\Pi_a$ at
time $t_0 + \varepsilon$; (b-d) depict the three types of events that are associated with $E_{22}E_{22}$
events and occur at a later time.



4 Visibility Events in Arrangements of Simplices 

The analysis in the case of simplices is considerably more complex and technical
than in the case of hyperplanes, since all of the 10 types of events illustrated in
Fig. 1 can occur. In this case we have:

Theorem 2. The number of cells in the vertical decomposition of an arrangement
of n 3-simplices in four dimensions is $O(n^4 \alpha(n) \log n)$.

The theorem is proved in two stages. In the first stage we analyze all types of
events, with the exception of $E_3E_3$ events, and prove that the number of events
of each type is $O(n^4 \alpha(n))$. This stage uses ideas similar to those introduced in
the previous section. That is, each event is charged to a feature of $\mathcal{A}(\Gamma)$, or
of $V_1(\Gamma)$, or to an event of a type that was analyzed previously. Special care
is taken to ensure that each such feature or event is charged at most a
constant number of times.

For establishing the stated complexity bound, we cannot charge events to
pairs $(U, A)$, where $U$ is a simplex of $\Gamma$ and $A$ is an edge in the zone of $U$, as
we did in the proof of Lemma 2. The reason is that the zone of a 3-simplex in
an arrangement of n 3-simplices in 4-space is only known to have complexity
$O(n^3 \log n)$, and not $O(n^3)$, as in the case of hyperplanes. This implies
that the overall number of pairs $(U, A)$ as above is $O(n^4 \log n)$. This, however, is
not good enough, due to the use of the Tagansky technique in the second stage
of the proof (see below), which adds a logarithmic overhead factor to the bound
that is proved in the first stage. Thus, to achieve the asserted $O(n^4 \alpha(n) \log n)$
overall bound, we have to prove a bound of $O(n^4 \alpha(n))$ in the first stage.

Accordingly, a charging scheme similar to those used in Lemmas 3 and 4 is
presented for most types of events. Unfortunately, the charging schemes are much
more involved, and so is their analysis. Due to space limitations, all the details
are omitted from this extended abstract, and will appear in the full version of
the paper.

In the second stage of the proof, we bound the number of $E_3E_3$ events using
the technique introduced by Tagansky for analyzing substructures in arrangements
of linear surfaces. Details are again omitted due to space limitations,
and we only describe the general idea here.

A charging scheme is presented, such that each $E_3E_3$ event is charged either
to a feature of $\mathcal{A}(\Gamma)$, or to a feature of $V_1(\Gamma)$, or to an event of one of the
types that were analyzed in the first stage, or to a 1-level $E_3E_3$ event (see
Tagansky's work for definitions). To utilize the Tagansky technique, we carefully
bound the number of times each 1-level $E_3E_3$ event is charged in this fashion. In
particular, we prove that although each $E_3E_3$ event makes 4 ‘units’ of charge, each
1-level $E_3E_3$ event is charged by at most 2 such units. As a result, we obtain a
recurrence for the number of $E_3E_3$ events, which solves to the asserted bound of
$O(n^4 \alpha(n) \log n)$, thus completing the proof of Theorem 2.

5 Conclusion 

We have presented improved upper bounds for vertical decompositions of ar- 
rangements of hyperplanes and 3-simplices in four dimensions. The current fo- 
cus of our work is to utilize the ideas and techniques introduced in this paper to 
prove a near-quartic upper bound on the complexity of vertical decompositions of 
arrangements of fixed-degree algebraic surfaces in four dimensions. Such a result 
appears to be attainable, and would have significant algorithmic applications.

Abstract. We make a connection between classical polytopes called
zonotopes and Support Vector Machine (SVM) classifiers. We combine
this connection with the ellipsoid method to give some new theoretical
results on training SVMs. We also describe some special properties of
C-SVMs for $C \to \infty$.



1 Introduction 

A statistical classifier algorithm maps a set of training vectors (positively and
negatively labeled points in $\mathbb{R}^d$) to a decision boundary. A Support Vector
Machine (SVM) is a classifier algorithm in which the decision boundary depends
on only a subset of training vectors, called the support vectors. This limited
dependence on the training set helps give SVMs good generalizability, meaning
that SVMs are resistant to overtraining even in the case of large d. Another key
idea associated with SVMs is the use of a kernel function in computing the dot
product of two training vectors. For example, the usual dot product $v \cdot w$ could be
replaced by $k(v, w) = (v \cdot w)^2$ (quadratic kernel) or by $k(v, w) = \exp(-\|v - w\|^2)$
(radial basis function). The kernel function [14] in effect maps the original training
vectors in $\mathbb{R}^d$ into a higher-dimensional (perhaps infinite-dimensional) feature
space; a linear decision boundary in the feature space then determines a nonlinear
decision surface back in $\mathbb{R}^d$. For good introductions to SVMs see the tutorial by
Burges or the book by Cristianini and Shawe-Taylor.
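The two kernels mentioned above can be written down directly. The sketch below also includes an explicit feature map for the quadratic kernel in the special case d = 2, to illustrate how a kernel stands in for a dot product in a higher-dimensional space; the function names and the restriction to d = 2 are assumptions made for this illustration.

```python
import math


def dot(v, w):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(v, w))


def quadratic_kernel(v, w):
    """k(v, w) = (v . w)^2."""
    return dot(v, w) ** 2


def rbf_kernel(v, w):
    """k(v, w) = exp(-||v - w||^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(v, w)))


def quadratic_features(v):
    """Explicit feature map for the quadratic kernel when d = 2.

    Maps (v1, v2) to (v1^2, sqrt(2)*v1*v2, v2^2), so that the ordinary dot
    product of two mapped vectors equals quadratic_kernel of the originals.
    """
    v1, v2 = v
    return (v1 * v1, math.sqrt(2.0) * v1 * v2, v2 * v2)
```

For the radial basis function no finite-dimensional map of this kind exists, which is why the kernel form is used in practice.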

The basic maximum margin SVM applies to the case of linearly separable
training vectors, and divides positive and negative vectors by a farthest-apart
pair of parallel hyperplanes, as shown in Figure 1(a). The decision boundary itself
is typically the hyperplane halfway between the boundaries. Computational
geometers might expect that the extension of the SVM to the non-separable
case would divide positive and negative vectors by a least-overlapping pair of
half-spaces bounded by parallel hyperplanes, as shown in Figure 1(b). This
generalization, however, may be overly sensitive to outliers, and hence the method
of choice is a more robust soft margin classifier, called a C-SVM or $\nu$-SVM [13]
depending upon the precise formulation. Parameter C is a user-chosen
penalty for errors.

Computing the maximum margin classifier for n vectors in $\mathbb{R}^d$ amounts to
solving a quadratic program (QP) with about d variables and n linear constraints.
If the feature vectors are not explicit (that is, kernel functions are
being used), then the usual Lagrangian formulation gives a QP with about n + d
variables and linear constraints. Similarly, the soft margin classifier, with or
without explicit feature vectors, is computed in a Lagrangian formulation with
about n + d variables and linear constraints. The jump from d to n + d variables
can have a great impact on the running time and choice of QP algorithm. Recent
results in computational geometry give fast QP algorithms for the case of
large n and small d, algorithms requiring about $O(nd) + (\log n) \exp(O(\sqrt{d}))$
arithmetic operations. The best bound on the number of arithmetic operations
for a QP with n + d variables and constraints is about 0{{n + d)^L), where L 
is the precision of the input data.

In this paper, we show that the jump from d to n + d is not necessary for
soft margin classifiers with explicit feature vectors. More specifically, we describe
training algorithms with running time near linear in n and polynomial in d and
input precision, for two different scenarios: C set by the user and $C \to \infty$. The
second scenario also introduces a natural measure of separability of point sets.
Our algorithms build upon a geometric view of soft margin classifiers and
the ellipsoid method for convex optimization. Due to their reliance on explicit
feature vectors and the ellipsoid method, and also due to the fact that SVMs are
more suited to the case of moderate n and large d than to the case of large n and
small d, our algorithms have little practical importance. On the other hand, our
results should be interesting theoretically. We view the soft margin classifier as
a problem defined over a zonotope, a type of polytope that admits an especially
compact description. Accordingly, our algorithms have lower complexity than
either the vertex or facet descriptions of the polytopes.




Fig. 1. (a) The maximum margin SVM classifier for the separable case. The dashed line
shows the decision boundary. (b) The most natural generalization to the non-separable
case is not popular.

2 SVM Formulations

We adopt the usual SVM notation and mostly follow the presentation of Bennett
and Bredensteiner. The training vectors are $x_1, x_2, \ldots, x_n$, points in $\mathbb{R}^d$. The
corresponding labels are $y_1, y_2, \ldots, y_n$, each of which is either +1 or -1. Let
$I_+ = \{i \mid y_i = +1\}$ and $I_- = \{i \mid y_i = -1\}$. We use $w$ and $x$ to denote vectors
in $\mathbb{R}^d$ and $b$ to denote a scalar. We use the dot product notation $w \cdot x$, but in
this section $w \cdot x$ could be standing in for the kernel function $k(w, x)$.

In the maximum margin SVM we seek parallel hyperplanes defined by the
equations $w \cdot x = b_+$ and $w \cdot x = b_-$ such that $w \cdot x_i \le b_-$ for all $i \in I_-$ and
$w \cdot x_i \ge b_+$ for all $i \in I_+$. The signed distance between these two hyperplanes, the
margin, is $(b_+ - b_-)/\|w\|$ and hence can be maximized by minimizing $\|w\|^2 - (b_+ - b_-)$.

\[
\min_{w,\, b_+,\, b_-} \; \|w\|^2 - (b_+ - b_-) \quad \text{subject to} \tag{1}
\]
\[
x_i \cdot w \ge b_+ \ \text{for } i \in I_+, \qquad x_i \cdot w \le b_- \ \text{for } i \in I_-.
\]

A popular choice for the decision boundary is the plane halfway between the
parallel hyperplanes, $w \cdot x = (b_+ + b_-)/2$, and hence each unknown vector $x$ is
classified according to the sign of $w \cdot x - (b_+ + b_-)/2$.

In the linearly separable case, we can set $b_+ = 1 - b$ and $b_- = -1 - b$ (thereby
rescaling $w$) and obtain the following optimization problem, the standard form
in most SVM treatments [5].

\[
\min_{w,\, b} \; \|w\|^2 \quad \text{subject to} \tag{2}
\]
\[
x_i \cdot w + b \ge 1 \ \text{for } i \in I_+, \qquad x_i \cdot w + b \le -1 \ \text{for } i \in I_-.
\]

Notice that this QP has d + 1 variables and n linear constraints. At the solution,
$w$ is a linear combination of $x_i$'s, $2/\|w\|$ gives the margin, and $w \cdot x + b = 0$ gives
the halfway decision boundary.
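The quantities named in the last paragraph can be checked numerically for a candidate pair (w, b). The following is a minimal sketch under the assumption that the training data are given as plain tuples; it only tests feasibility for formulation (2) and reports the resulting margin, and does not solve the QP. All function names are illustrative.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_feasible(w, b, xs, ys):
    """Check the constraints of formulation (2).

    For labels y_i in {+1, -1} both constraint families can be written as
    y_i * (x_i . w + b) >= 1.
    """
    return all(y * (dot(x, w) + b) >= 1 for x, y in zip(xs, ys))


def margin(w):
    """Distance 2/||w|| between the hyperplanes w.x + b = 1 and w.x + b = -1."""
    return 2.0 / math.sqrt(dot(w, w))


def classify(w, b, x):
    """Classify x by the side of the halfway boundary w.x + b = 0 it lies on."""
    return 1 if dot(w, x) + b >= 0 else -1
```

For instance, with positive points (2, 0), (3, 1) and negative points (0, 0), (-1, 1), the pair w = (1, 0), b = -1 is feasible and gives margin 2, whereas the shorter vector w = (0.5, 0) with b = -0.5 violates the constraints.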

The dual problem to maximizing the distance between parallel hyperplanes
separating the positive and negative convex hulls is to minimize the distance
between points inside the convex hulls. Thus the dual in the separable case is
the following.

\[
\min_{\alpha} \; \Bigl\| \sum_{i \in I_+} \alpha_i x_i - \sum_{i \in I_-} \alpha_i x_i \Bigr\|^2
\quad \text{subject to} \quad 0 \le \alpha_i \le 1, \quad
\sum_{i \in I_+} \alpha_i = 1, \quad \sum_{i \in I_-} \alpha_i = 1. \tag{3}
\]

Karush-Kuhn-Tucker (complementary slackness) conditions show that the
optimizing value of $w$ for (1) is given by the optimizing values of $\alpha_i$ for (3):
$w = \sum_{i \in I_+} \alpha_i x_i - \sum_{i \in I_-} \alpha_i x_i$; vectors $x_i$ with $\alpha_i > 0$ are called the support
vectors.

The soft margin SVM adds slack variables to formulation (1), and then
penalizes solutions proportional to the sum of these variables. Slack variable $\xi_i$
measures the error for training vector $x_i$, that is, how far $x_i$ lies on the wrong
side of the parallel hyperplane for $x_i$'s class.

\[
\min_{w,\, b_+,\, b_-,\, \xi} \; \|w\|^2 - (b_+ - b_-) + \mu \sum_{i=1}^{n} \xi_i \quad \text{subject to} \tag{4}
\]
\[
\xi_i \ge 0 \ \ \forall i, \qquad x_i \cdot w \ge b_+ - \xi_i \ \text{for } i \in I_+, \qquad
x_i \cdot w \le b_- + \xi_i \ \text{for } i \in I_-.
\]

The standard C-SVM formulation again sets $b_+ = 1 - b$ and $b_- = -1 - b$.

\[
\min_{w,\, b,\, \xi} \; \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \tag{5}
\]
\[
\xi_i \ge 0 \ \ \forall i, \qquad x_i \cdot w + b \ge 1 - \xi_i \ \text{for } i \in I_+, \qquad
x_i \cdot w + b \le -1 + \xi_i \ \text{for } i \in I_-.
\]

In formulation (5) the decision boundary is $w \cdot x + b = 0$. Formulation (4), however,
does not set the decision boundary, but only its direction. Crisp and Burges
write that because “originally the sum of $\xi_i$’s term arose in an attempt to
approximate the number of errors”, the best option might be to run a “simple line
search” to find the decision boundary that actually minimizes the number of
training set errors.
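One way such a line search could look is sketched below: once the direction w is fixed, only the sorted projections of the training vectors matter, and the error count can be updated as the threshold sweeps past each point. This is an illustrative sketch, not the procedure of Crisp and Burges; the function name `best_threshold` and the choice of midpoints as candidate thresholds are assumptions. The input is assumed to be non-empty.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_threshold(w, xs, ys):
    """Return (threshold, errors) minimizing training errors of sign(w.x - threshold).

    A point is predicted +1 when its projection w.x exceeds the threshold and
    -1 otherwise. Candidate thresholds are placed strictly between consecutive
    distinct projections, plus one below and one above all points.
    """
    proj = sorted((dot(w, x), y) for x, y in zip(xs, ys))
    # Threshold below every point: all points are predicted +1,
    # so the errors are exactly the negatively labeled points.
    errors = sum(1 for _, y in proj if y < 0)
    best = (proj[0][0] - 1.0, errors)
    for k, (p, y) in enumerate(proj):
        # Moving the threshold above point k flips its prediction to -1.
        errors += 1 if y > 0 else -1
        nxt = proj[k + 1][0] if k + 1 < len(proj) else p + 1.0
        if nxt > p and errors < best[1]:
            best = ((p + nxt) / 2.0, errors)
    return best
```

After the initial sort the sweep takes linear time, so the search costs O(n log n) comparisons plus the O(nd) needed to compute the projections.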

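The dual of formulation (4) in the separable case minimizes the distance
between points inside “reduced” or “soft” convex hulls.

\[
\min_{\alpha} \; \Bigl\| \sum_{i \in I_+} \alpha_i x_i - \sum_{i \in I_-} \alpha_i x_i \Bigr\|^2
\quad \text{subject to} \quad 0 \le \alpha_i \le \mu, \quad
\sum_{i \in I_+} \alpha_i = 1, \quad \sum_{i \in I_-} \alpha_i = 1. \tag{6}
\]

See Figure 2. The reduced convex hull of points $x_i$, $i \in I_+$, is the set of convex
combinations $\sum_{i \in I_+} \alpha_i x_i$ with each $\alpha_i \le \mu$. (Notice that in (6) there is no reason
to consider $\mu > 1$.) We shall say more about reduced convex hulls in the next
section.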




Fig. 2. (a) Soft margin SVMs maximize the margin between reduced convex hulls. (b)
Although the soft margin is often explained as a way to handle non-separability, it can
help in the separable case as well.

The dual view highlights a slight difference between formulations (4) and (5).
Formulation (4) allows the direct setting of the reduced convex hulls. Parameter
$\mu$ limits the influence of any single training point; if the user expects no more than
four outliers in the training set, then an appropriate choice of $\mu$ might be 1/9 in
order to ensure that the majority of the support vectors are non-outliers. If the
reduced convex hulls intersect, the solution to (4) is the least-overlapping pair of
half-spaces, as in Figure 1(b). Formulation (5) is also always feasible, unlike the
standard hard margin formulation (2), but it never allows the reduced convex
hulls to intersect. As $C \to \infty$ the reduced convex hulls either fill out their convex
hulls (the separable case) or continue growing until they asymptotically touch
(the non-separable case).



3 Reduced Convex Hulls and Zonotopes 



Assume $0 \le \mu \le 1$ and define the positive and negative reduced convex hulls by

\[
H_{+\mu} = \Bigl\{ \sum_{i \in I_+} \alpha_i x_i \Bigm| \sum_{i \in I_+} \alpha_i = 1, \ 0 \le \alpha_i \le \mu \Bigr\},
\]
\[
H_{-\mu} = \Bigl\{ \sum_{i \in I_-} \alpha_i x_i \Bigm| \sum_{i \in I_-} \alpha_i = 1, \ 0 \le \alpha_i \le \mu \Bigr\}.
\]

Figure 3 shows the reduced convex hull of three points $x_1$, $x_2$, and $x_3$ for various
values of $\mu$. The reduced convex hull grows from the centroid at $\mu = 1/3$ to the
convex hull at $\mu = 1$; for $\mu < 1/3$ it is empty. In Figure 2, $\mu$ is a little less than
1/2.
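The three regimes for three points follow directly from the two constraints on the weights; a short worked check, using only the definition of the reduced convex hull:

```latex
\[
  \alpha_1 + \alpha_2 + \alpha_3 = 1, \quad 0 \le \alpha_i \le \mu
  \quad\Longrightarrow\quad
  1 = \sum_{i=1}^{3} \alpha_i \le 3\mu .
\]
% mu < 1/3: the inequality 1 <= 3*mu fails, so no feasible weights exist
%           and the reduced convex hull is empty.
% mu = 1/3: equality forces alpha_1 = alpha_2 = alpha_3 = 1/3, so the hull
%           is the single point (x_1 + x_2 + x_3)/3, the centroid.
% mu = 1:   the upper bound alpha_i <= 1 is implied by the sum constraint,
%           so the hull is the ordinary convex hull of x_1, x_2, x_3.
```

The same argument with n points gives the threshold 1/n in place of 1/3.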

A reduced convex hull is a special case of a centroid polytope, the locus of
possible weighted averages of points each with an unknown weight within a
certain range. For reduced convex hulls, each weight $\alpha_i$ has the same range
$[0, \mu]$ and the sum of the weights is constrained to be 1. In earlier work we related
centroid polytopes in $\mathbb{R}^d$ to special polytopes, called zonotopes, in $\mathbb{R}^{d+1}$. We repeat
the connection here, specialized to the case of reduced convex hulls.

Let $v_i$ denote $(x_i, 1)$, the vector in $\mathbb{R}^{d+1}$ that agrees with $x_i$ on its first d
coordinates and has 1 as its last coordinate. Define

\[
Z_{+\mu} = \Bigl\{ \sum_{i \in I_+} \alpha_i v_i \Bigm| 0 \le \alpha_i \le \mu \Bigr\}.
\]

Polytope $Z_{+\mu}$ is a Minkowski sum¹ of line segments of the form
$S_i = \{ \alpha_i v_i \mid 0 \le \alpha_i \le \mu \}$. The Minkowski sum of line segments is a special type of
convex polytope called a zonotope. Polytope $H_{+\mu}$ is the cross-section of $Z_{+\mu}$ with the
hyperplane on which the (d + 1)-st coordinate (which by construction equals
$\sum_{i \in I_+} \alpha_i$) is equal to one. Of course, $H_{-\mu}$ can also be related to a zonotope
in the same way. The following lemmas state the property of zonotopes and reduced
convex hulls that underlies our algorithms. Lemma 2 is implicit in Keerthi et al.’s
iterative nearest-point approach to SVM training.

¹ The Minkowski sum of sets A and B in $\mathbb{R}^d$ is $\{p + q \mid p \in A \text{ and } q \in B\}$.

Fig. 3. The reduced convex hull of 3 points ranges from the centroid to the convex
hull.
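For finite point sets the Minkowski sum from the footnote can be computed by brute force, which also shows why listing zonotope vertices explicitly is unattractive: summing the end-points of n segments yields up to 2^n candidate points. The sketch below is illustrative only; `minkowski_sum` and `zonotope_candidates` are hypothetical helper names, and the second function returns a superset of the vertices rather than the vertex set itself.

```python
from itertools import product


def minkowski_sum(A, B):
    """Minkowski sum {p + q : p in A, q in B} of two finite sets of tuples."""
    return {tuple(pi + qi for pi, qi in zip(p, q)) for p, q in product(A, B)}


def zonotope_candidates(vs, mu):
    """Sum the end-points of the segments {a*v : 0 <= a <= mu} over all v in vs.

    Every vertex of the zonotope is among the returned points, but some of
    the returned points may lie in its interior or on its boundary without
    being vertices.
    """
    points = {tuple(0 for _ in vs[0])}
    for v in vs:
        segment_ends = {tuple(0 for _ in v), tuple(mu * c for c in v)}
        points = minkowski_sum(points, segment_ends)
    return points
```

With generators (1, 0), (0, 1), (1, 1) and mu = 1 this produces 7 distinct points for a hexagon with 6 vertices; the extra point (1, 1) is interior.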



Lemma 1. Let Z be a zonotope that is the Minkowski sum of n line segments
in $\mathbb{R}^d$. There is an algorithm with $O(nd)$ arithmetic operations for optimizing a
linear function over Z.

Proof. Assume that we are trying to find a vertex $v$ in zonotope Z extreme in
direction $w$, that is, that maximizes the dot product $w \cdot v$. Assume that Z is
the Minkowski sum of line segments of the form $S_i = \{\alpha_i v_i \mid 0 \le \alpha_i \le \mu\}$,
where $v_i \in \mathbb{R}^d$. We simply set each $\alpha_i$ independently to 0 or $\mu$, depending
upon whether the projection of $v_i$ onto $w$ is negative or positive. □
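The proof translates into a single pass over the generators. A minimal sketch, assuming the generators are given as tuples and all segments share the same bound mu; the function name is illustrative:

```python
def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def zonotope_extreme_point(vs, mu, w):
    """Point of the zonotope sum_i {a_i * v_i : 0 <= a_i <= mu} maximizing w . v.

    Each coefficient is chosen independently: mu when v_i has positive
    projection onto w, and 0 otherwise. One dot product and at most one
    vector addition per generator give O(n d) arithmetic operations.
    """
    point = [0] * len(w)
    for v in vs:
        if dot(v, w) > 0:
            for j, c in enumerate(v):
                point[j] += mu * c
    return point
```

Generators orthogonal to w contribute nothing to the objective, so leaving their coefficient at 0 is one of several optimal choices.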

Lemma 2. There is an algorithm with $O(nd)$ arithmetic operations for
optimizing a linear function over a reduced convex hull of n points in $\mathbb{R}^d$.

Proof. Assume that we are trying to find a vertex $x$ in $H_{+\mu}$ extreme
in direction $w$. Order the $x_i$'s with $y_i = +1$ according to their projection onto
vector $w$, breaking ties arbitrarily. In decreasing order by projection along $w$,
set the corresponding $\alpha_i$'s to $\mu$ until doing so would violate the constraint that
$\sum_i \alpha_i = 1$. Set $\alpha_i$ for this “transitional” vector to the maximum value allowed
by this constraint, and finally set the remaining $\alpha_i$'s to 0. Then $x = \sum_{i \in I_+} \alpha_i x_i$
maximizes $w \cdot x$. □
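The greedy assignment in the proof can be sketched as follows. For simplicity this sketch sorts the projections, which costs O(n log n) comparisons on top of the O(nd) arithmetic; matching the stated bound exactly would presumably require a linear-time selection step instead of a full sort, which is not shown here. The function name and the input conventions are assumptions.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def reduced_hull_extreme_point(xs, mu, w):
    """Return (x, alphas) with x maximizing w . x over the reduced convex hull.

    xs are the points of one class, and the weights satisfy
    0 <= alpha_i <= mu and sum(alpha_i) = 1.
    """
    n = len(xs)
    if mu * n < 1:
        raise ValueError("reduced convex hull is empty: mu < 1/n")
    # Visit the points in decreasing order of projection onto w.
    order = sorted(range(n), key=lambda i: dot(xs[i], w), reverse=True)
    alphas = [0] * n
    remaining = 1
    for i in order:
        # Full weight mu, except for the transitional point, which gets
        # whatever is left of the unit budget.
        alphas[i] = min(mu, remaining)
        remaining -= alphas[i]
        if remaining <= 0:
            break
    dim = len(xs[0])
    x = [sum(alphas[i] * xs[i][j] for i in range(n)) for j in range(dim)]
    return x, alphas
```

Passing `mu` as a `fractions.Fraction` keeps the weights exact; with floating-point `mu` the transitional weight is subject to rounding.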

An interesting combinatorial question asks for the worst-case complexity of
a reduced convex hull $H_{+\mu}$. The vertex $x$ of $H_{+\mu}$ that is extreme for direction
$w$ can be associated with the set of $x_i$'s for which $\alpha_i > 0$. If $\mu = 1/k$, then as
in Lemma 2 $x$'s set is the first k points in direction $w$, a set of k points that
can be separated from the other n - k points by a hyperplane normal to $w$. And
conversely, each separable set of k points defines a unique vertex of $H_{+\mu}$. Hence
the maximum number of vertices of $H_{+\mu}$ is equal to the maximum number of
k-sets for n points in $\mathbb{R}^d$, which is known to be $\omega(n^{d-1})$ and $o(n^d)$. In earlier work
we showed that a more general centroid polytope in which each point $x_i$ has $\alpha_i$
between 0 and $\mu_i$ (that is, different weight bounds for different points) may have
complexity $\Theta(n^d)$.

We can also apply the argument in the proof of Lemma 2 to say something
about the optimizing values of the variables in (4) and (6) for the non-separable
case. (Alternatively we can derive the same statements from the Karush-Kuhn-Tucker
conditions.) Each of $H_{+\mu}$ and $H_{-\mu}$ has a transition in the sorted order of
the $x_i$'s when projected along the normal $w$ to the parallel pair of hyperplanes.
For $x_i$ with $i \in I_+$, $\alpha_i = 0$ if $x_i \cdot w$ lies on the “right” side of the transition,
$0 \le \alpha_i \le \mu$ if $x_i \cdot w$ coincides with the transition, and $\alpha_i = \mu$ if $x_i$ lies on the
“wrong” side of the transition. Of course an analogous statement holds for $x_i$ for
$i \in I_-$. As usual, the support vectors are those $x_i$ with $\alpha_i > 0$. Thus all training
set errors are support vectors. In Figure 2(a) there are six support vectors: two
transitional unfilled dots (marked $x_i$ and $x_j$) and one wrong-side unfilled dot,
along with one transitional and two wrong-side filled dots.

4 Ellipsoid-Based Algorithms 

We first assume that fx has been fixed in advance, perhaps using some knowledge 
of the expected number of outliers or the desired number of support vectors. We 
give an algorithm for solving formulation (El). 

One approach would be to compute the vertices of $H_{+\mu}$ and $H_{-\mu}$ and then
use formulation o with positive and negative training vectors replaced by the 
vertices of $H_{+\mu}$ and $H_{-\mu}$, respectively. However, the number of vertices of
$H_{+\mu}$ and $H_{-\mu}$ may be very large, so this algorithm could be very slow.

So instead we exploit a polynomial-time equivalence between separation and 
optimization (see for example 1 1 t)j . chapter 14.2). The input to the separation 
problem is a point $q$ and a polytope $P$ (typically given by a system of linear
inequalities). The output is either a statement that $q$ is inside $P$ or a hyperplane
separating $q$ and $P$. The input to the optimization problem is a direction $w$ and
a polytope $P$. The output is either a statement that $P$ is empty, a statement
that $P$ is unbounded in direction $w$, or a point in $P$ extreme for direction $w$. The
two problems are related by projective duality^ and a subroutine for solving one
can be used to solve the other in a number of calls that is polynomial in the
dimension $d$ and the input precision, that is, the number of bits in $q$ or $w$ plus
the maximum number of bits in an inequality defining $P$.

In our case, the polytope is not given by inequalities, but rather as a 
Minkowski sum of line segments; this presentation has an impact on the re- 
quired precision. If the input precision is L, the maximum number of bits in one 
of the feature vectors Xi, then the maximum number of bits in a vertex of the 
polytope is O(d^Llogn). What is new is the O(logn) term, resulting from the 
fact that a vertex of a zonotope is a sum of up to n input vectors. 
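
To make the optimization side concrete for a Minkowski sum of line segments, here is a minimal sketch. It assumes each segment is given explicitly by its two endpoints, and the function name `zonotope_extreme` is hypothetical. Because a linear function is maximized over a Minkowski sum by maximizing it over each summand separately, one pass over the segments suffices.

```python
def zonotope_extreme(segments, w):
    """Sketch: a point of the Minkowski sum of the given segments that is
    extreme in direction w. Each segment is a pair (a, b) of endpoints."""
    def dot(x):
        return sum(wi * xi for wi, xi in zip(w, x))

    total = [0] * len(w)
    for a, b in segments:
        # On a single segment the maximum of w . x is attained at an endpoint.
        best = a if dot(a) >= dot(b) else b
        total = [t + c for t, c in zip(total, best)]
    return total
```

The returned point is a sum of up to n input vectors, which is the source of the extra O(log n) bits mentioned above.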

^ The more famous direction of this equivalence is that separation — which can be 
solved directly by checking each inequality — implies optimization. This result is a 
corollary of Khachiyan’s ellipsoid method. 
Theorem 1. Given $n$ explicit feature vectors in $\mathbb{R}^d$ and $\mu$ with $1/n \le \mu \le 1$,
there is a polynomial-time algorithm for computing a soft margin classifier, with 
the number of arithmetic operations linear in n and polynomial in d, L, and 
log n. 

Proof. As in j0|, consider the polytope $P$ that is the Minkowski sum of $H_{+\mu}$
and $-H_{-\mu}$, that is, $P = \{ v_+ - v_- \mid v_+ \in H_{+\mu} \text{ and } v_- \in H_{-\mu} \}$. We are trying
to minimize over $P$ the convex quadratic objective function $\|v\|^2$, that is, the
length of a line segment between $H_{+\mu}$ and $H_{-\mu}$.

For a given direction $w$, we can find the solution $v = v_+ - v_-$ to the linear
optimization problem for $P$ by using Lemma El to find the $v_+$ optimizing $w$ over
$H_{+\mu}$ and the $v_-$ optimizing $w$ over $H_{-\mu}$. Now given a point $q \in \mathbb{R}^d$, we can use
this observation and the polynomial-time equivalence between separation and
optimization to solve the separation problem for $q$ and $P$ in time linear in $n$ and
polynomial in $d$ and $L$. We can use this solution to the separation problem for $P$
as a subroutine for the ellipsoid method (see jnnni) in order to optimize $\|v\|^2$
over $P$. Given an optimizing choice of $v = v_+ - v_-$, it is easy to find the best
pair of parallel hyperplanes and a decision boundary, either the C-SVM decision
boundary or some other reasonable choice within the parallel family. □

Now assume that we are in the non-separable case. We shall show how to solve
for the maximum $\mu$ for which the reduced convex hulls have non-intersecting
interior, that is, the $\mu$ for which the margin is 0. This choice of $\mu$ corresponds
to $C \to \infty$ and the objective function simplifying to in formulation (EJ.

This choice of $\mu$ has two special properties. First, among all settings of $C$,
$C \to \infty$ tends to give the fewest support vectors. To see this, imagine shrinking
the shaded regions in Figure El a)- Support vectors are added each time one of 
the parallel hyperplanes crosses a training vector. On the other hand, a support 
vector may be lost occasionally when the number of reduced convex hull ver- 
tices on the parallel hyperplanes changes, for example, if the vertex supporting 
the upper parallel line in Figure EJa) slipped off to the right of the segment 
supporting the lower parallel line. 

Second, the $\mu$ for which the margin is zero gives a natural measure of the
separability of two point sets. For simplicity, let $|I_+| = |I_-| = n/2$ and normalize
the zero-margin $\mu$ by $\mu^* = (\mu - 2/n)/(1 - 2/n)$. The separability measure $\mu^*$
runs from 0 to 1, with 0 meaning that the centroids coincide and 1 meaning
that the convex hulls have disjoint interiors. Computing the zero-margin $\mu$ as
the maximum value of a dual variable using the formulation above is no
harder than training a C-SVM, and in the case of explicit features, it should be
significantly easier, as we now show.
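
As a small worked illustration of the normalization (a sketch only; the helper name `normalized_separability` is hypothetical, and it assumes two classes of equal size n/2 as in the text), exact rational arithmetic makes the two endpoints of the scale easy to check:

```python
from fractions import Fraction

def normalized_separability(mu, n):
    """Sketch: map the zero-margin mu in [2/n, 1] to the range [0, 1],
    assuming both classes contain n/2 points."""
    if n <= 2 or n % 2 != 0:
        raise ValueError("expects an even n > 2 (two classes of size n/2)")
    lowest = Fraction(2, n)  # mu = 2/n forces every weight to equal 2/n
    return (Fraction(mu) - lowest) / (1 - lowest)
```

For n = 10, the value mu = 1/5 maps to 0 (only the centroids are compared) and mu = 1 maps to 1 (the full convex hulls).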

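We can formulate the problem as minimizing $\mu$ subject to

$$\sum_{i \in I_+} \beta_i x_i = \sum_{i \in I_-} \beta_i x_i, \qquad \sum_{i \in I_+} \beta_i = 1, \qquad \sum_{i \in I_-} \beta_i = 1, \qquad 0 \le \beta_i \le \mu.$$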

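As above, let $v_i$ denote $(x_i, 1)$, the vector in $\mathbb{R}^{d+1}$ that agrees with $x_i$ on its first
$d$ coordinates and has 1 as its last coordinate. Letting $\alpha_i = \beta_i / \mu$, we can rewrite
the problem as maximizing

$$1/\mu = \sum_{i \in I_+} \alpha_i = \sum_{i \in I_-} \alpha_i$$

subject to

$$\sum_{i \in I_+} \alpha_i x_i = \sum_{i \in I_-} \alpha_i x_i, \qquad 0 \le \alpha_i \le 1. \qquad (7)$$

Yet another way to state the problem is to ask for the point with maximum
$(d+1)$-st coordinate in $Z_+ \cap Z_-$, where

$$Z_+ = \Big\{ \sum_{i \in I_+} \alpha_i v_i \;\Big|\; 0 \le \alpha_i \le 1 \Big\}, \qquad Z_- = \Big\{ \sum_{i \in I_-} \alpha_i v_i \;\Big|\; 0 \le \alpha_i \le 1 \Big\}.$$

Polytopes $Z_+$ and $Z_-$ are each zonotopes, Minkowski sums of line segments of
the form $S_i = \{ \alpha_i v_i \mid 0 \le \alpha_i \le 1 \}$.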

Theorem 2. Let $Z_1$ and $Z_2$ be zonotopes defined by a total of $n$ line segments in
$\mathbb{R}^d$. There is an algorithm for optimizing a linear objective function over $Z_1 \cap Z_2$,
with the number of arithmetic operations linear in $n$ and polynomial in $d$, $L$, and
$\log n$.

Proof. Given a point $q$ and zonotope $Z_j$, $1 \le j \le 2$, we can use Lemma ^
and the polynomial-time equivalence between separation and optimization to
solve the separation problem for $q$ and $Z_j$ in time linear in $n$ and polynomial
in $d$, $L$ and $\log n$. We can solve the separation problem for the intersection of
zonotopes $Z_1 \cap Z_2$ simply by solving it separately for each zonotope. We now use
the equivalence between separation and optimization in the other direction to
conclude that we can also solve the optimization problem for an intersection of
zonotopes. □

The proof of the following result then follows from the ellipsoid method in
the same way as the proof of Theorem 1.

Corollary 1. Given $n$ explicit feature vectors in $\mathbb{R}^d$, there is a polynomial-time
algorithm for computing the maximum $\mu$ for which $H_{+\mu}$ and $H_{-\mu}$ are linearly
separable, with the number of arithmetic operations linear in $n$ and polynomial
in $d$, $L$, and $\log n$.

Theorem 1 and Corollary 1 can be extended to some cases of implicit feature
vectors. For example, the quadratic kernel $k(v,w) = (v \cdot w)^2$ for vectors
$v = (v_1, v_2)$ and $w = (w_1, w_2)$ in $\mathbb{R}^2$ is equivalent to an ordinary dot product in $\mathbb{R}^3$,
namely $k(v,w) = \Phi(v) \cdot \Phi(w)$, where $\Phi(v) = (v_1^2, \sqrt{2}\, v_1 v_2, v_2^2)$. In general Pj, a
polynomial kernel k(v, w) = (v ■ w)^ amounts to lifting the training vectors from 
R“^ to R*^ where d' = Radial basis functions, however, give d' = 00 , 

and the SVM training problem seems to necessarily involve n-\-d variables. (The 
rather amazing part is that it is a combinatorial optimization problem at all!) 
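
The quadratic-kernel identity above is easy to check numerically. The following sketch (helper names `quadratic_kernel`, `lift`, and `dot` are hypothetical) compares the kernel value in the plane with the ordinary dot product of the lifted vectors in three dimensions:

```python
from math import sqrt, isclose

def quadratic_kernel(v, w):
    """k(v, w) = (v . w)^2 for vectors in the plane."""
    return (v[0] * w[0] + v[1] * w[1]) ** 2

def lift(v):
    """Explicit feature map: (v1^2, sqrt(2) v1 v2, v2^2)."""
    return (v[0] ** 2, sqrt(2) * v[0] * v[1], v[1] ** 2)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

v, w = (1.0, 2.0), (3.0, -1.0)
# Both sides should agree up to floating-point rounding.
same = isclose(quadratic_kernel(v, w), dot(lift(v), lift(w)))
```

When the lifted dimension is small, as here, the lifted vectors can be treated as explicit feature vectors.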




5 Discussion and Conclusions

In this paper we have connected SVMs to some recent results in computational 
geometry and mathematical programming. These connections raise some new 
questions, both practical and theoretical. 

Currently the best practical algorithms for training SVMs, Platt’s sequential 
minimal optimization (SMO) JEj and Keerthi et al.’s nearest point algorithm 
(NPA) 0, can be viewed as interior-point methods that iteratively optimize the 
margin over line segments. Both algorithms make use of heuristics to find line 
segments close to the exterior, meaning line segments with weights set to 
either 0 or C. 

Computational geometry may have a practical algorithm to contribute for the
case of $n$ large and $d$ small, say $n \approx 100{,}000$ and $d \approx 20$: the generalized linear
programming (GLP) paradigm of Matoušek et al. mu. The training vectors
need not actually live in $\mathbb{R}^d$ for small $d$, so long as the GLP dimension of the
problem is small, where the GLP dimension is the number of support vectors in
any subproblem defined by a subset of the training vectors.

On the theoretical side, we are wondering about the existence of strongly 
polynomial algorithms for QP problems over zonotopes. Due to the combinatorial 
equivalence of zonotopes and arrangements, the graph diameter of a zonotope 
is known to be only $O(n)$; polynomial graph diameter is of course a necessary
condition for the existence of a polynomial-time simplex-style algorithm. 
Abstract. Let $\mathbb{P} = \{P_1, \ldots, P_m\}$ be a set of $m$ convex polytopes in
$\mathbb{R}^d$, for $d = 2, 3$, with a total of $n$ vertices. We present output-sensitive
algorithms for reporting all $k$ pairs of indices $(i,j)$ such that $P_i$
intersects $P_j$. For the planar case we describe a simple algorithm with
running time $O(n^{4/3} \log n + k)$, and an improved randomized algorithm
with expected running time $O((n \log m + k)\alpha(n) \log n)$ (which is faster
for small values of $k$). For $d = 3$, we present an $O(n^{8/5+\varepsilon} + k)$-time
algorithm, for any $\varepsilon > 0$. Our algorithms can be modified to count the
number of intersecting pairs in $O(n^{4/3} \log^{O(1)} n)$ time for the planar
case, and in $O(n^{8/5+\varepsilon})$ time in $\mathbb{R}^3$.



1 Introduction 

Computing intersections in a set of geometric objects is a fundamental problem 
in computational geometry. A basic version of this problem is when the objects 
are line segments in the plane. Indeed, computing the intersecting pairs in a
set of $n$ line segments was one of the first problems studied in computational
geometry: already in 1979, Bentley and Ottmann j7| described an algorithm
for this problem with $O((n + k) \log n)$ running time, where $k$ is the number of
intersecting pairs of segments. Since then much research has been done on this
problem, culminating in optimal (that is, with $O(n \log n + k)$ running time)
deterministic algorithms by Chazelle and Edelsbrunner m and Balaban , and
simpler randomized algorithms by Clarkson and Shor M and Mulmuley uni.

Another well-studied variant of the problem is the red-blue intersection problem.
Here one is given a set of red segments and a set of blue segments, and the
goal is to report all bichromatic intersections. If there are no monochromatic
intersections, then the problem can be solved in $O(n \log n + k)$ time by applying
an optimal standard line-segment intersection algorithm; when the red segments
and the blue segments both form simply connected subdivisions, then the problem
can even be solved in $O(n + k)$ time 1 1 5] . The situation becomes considerably
more complicated when there are monochromatic intersections. Applying a
standard line-segment intersection algorithm will not lead to an output-sensitive
algorithm because it may report a quadratic number of monochromatic intersections
even when there are no bichromatic intersections. Somehow one has to
avoid processing all the monochromatic intersections. Agarwal and Sharir 0
showed that one can detect whether the two sets intersect in $O(n^{4/3+\varepsilon})$ time.^

Later Agarwal [IJ and Chazelle jOj gave $O(n^{4/3} \log^{O(1)} n + k)$-time algorithms
to report all $k$ red-blue intersections. Basch et al. ^ presented a deterministic
$O(\lambda_{t+2}(n + k) \log^3 n)$ algorithm for the case where the set of red segments
is connected and the set of blue segments is connected; this algorithm
also works for the case of Jordan arcs, each pair of which intersect at most $t$
times. Its running time is $O(\lambda_{t+2}(n + k) \log^3 n)$, where $\lambda_s(n)$, the maximum
length of an $(n, s)$ Davenport-Schinzel sequence, is an almost linear function of
$n$ for any fixed $s$. This bound was later improved for the case of segments to
$O((n + k) \log^2 n \log\log n)$ by Brodal and Jacob jH]. Har-Peled and Sharir HU
give a randomized algorithm with $O(\lambda_{t+2}(n + k) \log n)$ running time for the case
of Jordan arcs, as above.

We are interested in the case in which the input consists of convex polygons
in the plane. We want to compute all intersecting pairs of polygons. More formally,
we are given a set $\mathbb{P} = \{P_1, \ldots, P_m\}$ of $m$ convex polygons in $\mathbb{R}^2$ with a
total of $n$ vertices, and we want to report all $k$ pairs of indices $i,j$ such that $P_i$
intersects $P_j$. (The polygons are considered to be 2-dimensional regions, so two
polygons intersect also in the case that one of them is fully contained inside the
other.) If each polygon $P_i$ has constant complexity, then the number of intersections
between pairs of edges will not exceed the total number of intersecting
pairs of polygons by more than a constant factor, and one can solve the problem
in $O(n \log n + k)$ time, by a straightforward modification of the algorithms
mentioned above for reporting segment intersections. If the given polygons do

^ The meaning of a bound like this is that for any $\varepsilon > 0$ there exists a constant $c = c(\varepsilon)$
that depends on $\varepsilon$, so that the bound holds with $c$ as the constant of proportionality.
not have constant complexity, then the problem becomes considerably harder
because the intersection of a pair of the given polygons can have many vertices. 
Regarding each input polygon as a collection of segments will thus not lead to 
an output-sensitive algorithm in this case. 

Gupta et al. m nevertheless managed to develop an output-sensitive algorithm
for this case that runs in time $O(n^{4/3+\varepsilon} + k)$. The algorithm first computes

a trapezoidal decomposition for each polygon. Then it computes, using a multi- 
level partition tree, those pairs of intersecting trapezoids such that the leftmost 
intersection point of the trapezoids is also the leftmost intersection point of the 
corresponding polygons. This way it is ensured that each intersecting pair of 
polygons is reported exactly once. 

We develop two new algorithms for this problem. The first algorithm is randomized
and combines hereditary segment trees m with the above mentioned
red-blue intersection algorithm of Har-Peled and Sharir m. Its expected running
time is $O((n \log m + k)\alpha(n) \log n)$ and it is significantly faster than the
algorithm of Gupta et al. when $k = o(n^{4/3})$. In addition, the algorithm also
works for convex splinegons (that is, convex shapes whose boundary is composed
of Jordan arcs) with only a minor increase in running time; this is not the
case for the algorithm of Gupta et al. Our algorithm can be made deterministic
at the expense of an additional polylogarithmic factor.

Our second algorithm has $O(n^{4/3} \log n + k)$ running time, and is thus slightly
faster than our first algorithm for $k = \Omega(n^{4/3})$. It is related to the algorithm
of Gupta et al. (it uses partition trees and similar techniques to search for
the rightmost intersection points of intersecting pairs of polygons) but it is
conceptually simpler and it has a slightly better running time.

The main advantage of our approach over Gupta et al.’s is that it generalizes
to the 3-dimensional version of the problem: Given a set $\mathbb{P} = \{P_1, \ldots, P_m\}$ of
$m$ convex polytopes in $\mathbb{R}^3$ with a total of $n$ vertices, report all $k$ pairs of indices
$(i,j)$ such that $P_i$ intersects $P_j$. For this problem, no subquadratic algorithm
was known. We generalize our second 2-dimensional algorithm, and obtain an
algorithm with running time $O(n^{8/5+\varepsilon} + k)$, for any $\varepsilon > 0$. Such a generalization
seems hard for the algorithm of Gupta et al., as the vertical decomposition of a
convex polytope can have quadratic complexity. Note that our algorithm for the
3-dimensional case has the same running time as the best known algorithm for
the much simpler problem of reporting all intersecting pairs in a set of triangles
in $\mathbb{R}^3$ [2].

2 The Planar Case 

Let $\mathbb{P} = \{P_1, \ldots, P_m\}$ be a set of $m$ convex polygons in the plane, with a total
of $n$ vertices. For simplicity, we assume that none of the polygons has a vertical
edge and that all the vertex coordinates are distinct; we can enforce this in
$O(n \log n)$ time by applying a suitable rotation. For a polygon $P_i$, we define $\ell_i$ to
be the leftmost point of $P_i$ and $r_i$ to be the rightmost point of $P_i$ (since there are
no vertical edges, $\ell_i$ and $r_i$ are uniquely defined). They partition the boundary
of $P_i$ into two convex chains: the upper chain, denoted $U_i$, and the lower chain,
denoted $L_i$.
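
The partition of a polygon boundary into its two chains can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: the polygon is given as a list of `(x, y)` vertices in counterclockwise order with distinct x-coordinates, and the name `split_chains` is hypothetical.

```python
def split_chains(polygon):
    """Sketch: split a convex polygon (counterclockwise vertex list) into
    its lower and upper chains, both listed from left to right."""
    n = len(polygon)
    left = min(range(n), key=lambda i: polygon[i])   # index of leftmost vertex
    right = max(range(n), key=lambda i: polygon[i])  # index of rightmost vertex

    def walk(start, end):
        # Follow the counterclockwise order from start to end, inclusive.
        chain = [polygon[start]]
        i = start
        while i != end:
            i = (i + 1) % n
            chain.append(polygon[i])
        return chain

    lower = walk(left, right)          # counterclockwise from leftmost: the bottom
    upper = walk(right, left)[::-1]    # the remaining boundary, reversed
    return lower, upper
```

Both chains share the leftmost and rightmost vertices, which are the endpoints of the spine used in Section 2.1.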

We first describe an algorithm whose running time is near-linear in $n$ and $k$,
and then a worst-case optimal algorithm for the case of large $k$; its worst-case
running time is $O(n^{4/3} \log n + k)$.

2.1 A Near-Linear Randomized Algorithm 

We present a randomized algorithm that reports, in $O((n \log m + k)\alpha(n) \log n)$
expected time, all $k$ intersecting pairs of polygons in $\mathbb{P}$. For each polygon $P_i$, we
define $s_i$ to be the segment connecting $\ell_i$ to $r_i$; we call $s_i$ the spine of $P_i$. Let
$\mathbb{SP}$ denote the set of all the spines.

Our algorithm starts by constructing a hereditary segment tree $T$ on (the
$x$-projections of) the spines in $\mathbb{SP}$. Each node $v$ of $T$ is associated with a
vertical strip $W_v$ and with a subset $\mathbb{SP}(v)$ of spines. A spine $s_i$ intersecting $W_v$
is short at $v$ if at least one of its endpoints lies in the interior of $W_v$; otherwise it
is long. The set $\mathbb{SP}(v)$ is the subset of spines that intersect $W_v$ and are short at
the parent of $v$. If $v$ is the root, then $\mathbb{SP}(v) = \mathbb{SP}$. Let $\mathbb{P}(v) = \{P_i \mid s_i \in \mathbb{SP}(v)\}$.
A polygon is short (resp., long) at $v$ if its spine is short (resp., long) at $v$. By
the structure of hereditary segment trees, $\sum_v |\mathbb{P}(v)| = O(m \log m)$.
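
The short/long distinction can be illustrated with a small helper. This is only a sketch: the name `classify_spine` is hypothetical, a strip is represented by its pair of bounding x-coordinates, and a spine is taken to meet the strip when its x-range overlaps the strip's interior (an assumption made here to sidestep degenerate touching cases).

```python
def classify_spine(spine, strip):
    """Sketch: return 'short', 'long', or None for a spine relative to a
    vertical strip. spine = ((lx, ly), (rx, ry)), strip = (a, b) with a < b."""
    (lx, _), (rx, _) = spine
    a, b = strip
    if rx <= a or lx >= b:
        return None        # the spine does not reach the strip's interior
    if a < lx < b or a < rx < b:
        return "short"     # some endpoint lies strictly inside the strip
    return "long"          # the spine crosses the strip completely
```

For the strip between x = 2 and x = 5, a spine from x = 0 to x = 7 is long, while a spine from x = 3 to x = 7 is short.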

We assume that $\mathbb{SP}(v)$ and $\mathbb{P}(v)$ are clipped to within $W_v$. At each node $v$ of
the tree, we will report all pairs $(i,j)$ such that

(*) the rightmost intersection point of $P_i$ and $P_j$ lies inside $W_v$ and $P_i$
is long at $v$.

The following lemma is straightforward from the structure of hereditary seg- 
ment trees. 

Lemma 1. For every pair of intersecting polygons Pi and Pj, there is exactly 
one node v of T at which property (*) holds. 

Let $k_v$ be the number of pairs that satisfy property (*) at a node $v$. Then
$\sum_v k_v = k$. Our procedure will ensure that a pair $(i,j)$ is reported only once, at
the node where (*) is satisfied, but it will spend roughly $O(\log n)$ time for each
intersecting pair.

Fix a node $v$. Let $\mathbb{P}_L \subseteq \mathbb{P}(v)$ denote the subset of long polygons at $v$, and
let $\mathbb{P}_S \subseteq \mathbb{P}(v)$ denote the subset of short polygons at $v$. Denote the set of spines
of $\mathbb{P}_L$ by $\mathbb{SP}_L$, the set of their upper chains by $\mathbb{U}_L$, and the set of their lower
chains by $\mathbb{L}_L$. The sets $\mathbb{SP}_S$, $\mathbb{U}_S$, and $\mathbb{L}_S$ are defined analogously for the short
polygons. Again, all these objects are clipped to within $W_v$. Let $n_v$ denote the
total number of edges in (the clipped) $\mathbb{P}_L$ and $\mathbb{P}_S$. As above, the structure of
hereditary segment trees implies that $\sum_v n_v = O(n \log m)$. Finally, we define
$R_S$ to be the set of right endpoints of the spines in $\mathbb{SP}(v)$ that lie in the interior
of $W_v$. Note that every point in $R_S$ is the right endpoint of an (unclipped)
original spine in $\mathbb{SP}$. Let $\mu_v$ be the number of intersection points between $\mathbb{SP}_S$
and $\mathbb{SP}(v) \cup \partial\mathbb{P}(v)$ plus the number of intersection points between the upper
(resp. lower) chains of $\mathbb{P}_L$ and the lower (resp. upper) chains of $\mathbb{P}(v)$, where
$\partial\mathbb{P}(v) = \{\partial P \mid P \in \mathbb{P}(v)\}$. We have:

Lemma 2. $\sum_v \mu_v = O(k)$.



Since all the vertex coordinates are distinct, there exists at most one spine in
$\mathbb{SP}(v)$ whose right endpoint lies on the right boundary of $W_v$. We can easily
compute in $O(n_v)$ time all polygons of $\mathbb{P}(v)$ that contain this endpoint. We now describe
how we report all the other pairs that satisfy (*) at $v$.

We construct, in $O(n_v \log n_v + \mu_v)$ time, the arrangement $\mathcal{A} = \mathcal{A}(\mathbb{SP}_L)$ of
the spines of the long polygons m. We also add the vertical lines bounding $W_v$
to $\mathcal{A}$. Each face $f$ of $\mathcal{A}$ is a convex polygon, so we can compute the intersections
between a line and $\partial f$ in $O(\log n_v)$ time. We preprocess $\mathcal{A}$, in $O((n_v + \mu_v) \log n_v)$
time, for planar point-location queries m. For each edge $e$ of $\mathbb{P}(v)$, we locate
its left endpoint in $\mathcal{A}$ and then trace it through $\mathcal{A}$, spending $O(\log n_v)$ time at
each face of $\mathcal{A}$ that $e$ intersects.

For each face $f \in \mathcal{A}$, we report the pairs $(i,j)$ that satisfy (*) and for which
the rightmost point of $P_i \cap P_j$ lies inside $f$. This is accomplished in the following
three stages.



(a) Report all pairs $(i,j)$ such that $P_i \in \mathbb{P}_L$ contains the right endpoint $r_j \in R_S$
and $r_j \in f$.

(b) Report all pairs $(i,j)$ such that the lower chain of $P_i \in \mathbb{P}_L$ intersects the
upper chain of $P_j \in \mathbb{P}(v)$ and the rightmost point of their intersection lies
inside $f$.

(c) Report all pairs $(i,j)$ such that the upper chain of $P_i \in \mathbb{P}_L$ intersects the
lower chain of $P_j \in \mathbb{P}(v)$ and the rightmost point of their intersection lies
inside $f$.



It is easily verified that stages (a)-(c) indeed report all the desired intersec- 
tions. Since (b) and (c) are symmetric, we omit the description of (c). 



Containments of rightmost points. Let $R(f) \subseteq R_S$ be the subset of right endpoints
that lie inside $f$. We wish to report all pairs $(i,j)$ such that $r_j \in R(f)$ lies
inside $P_i \in \mathbb{P}_L$. Let $\mathbb{P}(f) \subseteq \mathbb{P}_L$ denote the set of long polygons that contain $f$ in
their interior (i.e., for a polygon $P \in \mathbb{P}(f)$, we have $f \subseteq P$), and let $\mathbb{Q}(f) \subseteq \mathbb{P}_L$
denote the set of polygons whose boundaries intersect $f$. Let $n_f$ denote the number
of vertices of the polygons in $\mathbb{Q}(f)$ that lie inside $f$, and let $n'_f$ denote the
number of edges in $\mathbb{Q}(f)$ that intersect $f$ but whose endpoints do not lie inside
/. Then < Uy and X)/ Obviously, |Q(/)| < nf + n'j. Since we 

have already traced the edges of $\mathbb{P}_L$ through $\mathcal{A}$, we have $\mathbb{Q}(f)$ at our disposal.
However, we do not store $\mathbb{P}(f)$ explicitly for each face $f$ because the resulting
storage could be quite large.

Note that every point in $R(f)$ lies inside every polygon in $\mathbb{P}(f)$, so we report
every pair in $\mathbb{P}(f) \times R(f)$. In order to compute the polygons of $\mathbb{P}(f)$, we perform
a plane sweep over $\mathcal{A}$ and the collection of long polygons. The events of the sweep
are (i) all the vertices of $\mathcal{A}$; (ii) left and right endpoints of polygons in $\mathbb{P}_L$; and (iii)
intersections of boundaries of polygons in $\mathbb{P}_L$ with the edges of $\mathcal{A}$. The number
of events is $O(n_v + \mu_v + k_v)$. The C-structure consists of the intersections of the
sweep line with the edges of $\mathcal{A}$. Each interval between consecutive intersections
represents a face $f$ of $\mathcal{A}$; we store there the current set $\mathbb{Q}(f)$. Each polygon $P_i$ of
$\mathbb{P}_L$ appears in at most two intervals of the C-structure, and we associate with it
the union $E_i$ of the intervals of the C-structure that lie strictly between these two
intervals. We finally convert the C-structure into a segment tree, which stores
the intervals $E_i$.

Updating the C-structure at each sweep event is easy, and takes logarithmic
time. When we reach a point $r_j \in R_S$, we simply report all the $l_j$ intervals $E_i$
that contain the interval of the face of $\mathcal{A}$ that contains $r_j$. This can be done in
time $O(l_j + \log n_v)$. In total, this step takes time $O((n_v + \mu_v + k_v) \log n_v)$.

Next, for every point G R{f), we report the polygons in Q(/) that contain 
rj. We build a union tree E on the polygons in Q(/), which is a minimum- 
height binary tree whose leaves store the polygons of Q(/). Each node ^ of S' is 
associated with the subset Qj C Q(/) of polygons that are stored at the leaves 
of the subtree rooted at Let be the total number of vertices of the polygons 
in Qj that lie in the interior of /, and let v'^ be the number of edges of the 
polygons in that intersect / but whose endpoints do not lie inside /; we 
have = Olriflogriy) and = 0(n'j log n„). Let (resp., U^) denote 

the set of maximal connected portions of the lower (resp., upper) chains of the 
polygons in that lie inside /. At each node we compute the lower envelope 
Lj of and the upper envelope of U^. These envelopes have 0{{i^^+v'^)a{ny)) 
breakpoints. If we have already computed the lower and upper envelopes of the 
children of then can be computed in an additional -I- ly^)a(ny)) 

time. We store the sequences of breakpoints of (and C/j) in an array, sorted 
from left to right. For each breakpoint, we store the segment that appears on the 
envelope immediately to its left. We also apply fractional cascading m so that 
for a given ^-coordinate Xq, if we know the breakpoint of Lj (resp. Uj) that is 
immediately to the right of xq, we can compute, in 0(1) time, the breakpoints 
of Lri (resp. 11^, II,,) that lie to the right of xq, where C, ij are the children of 
The total time spent in preprocessing E is 0((n/ -I- n'j)a{ny) logn„). 

For each point r^ G R{f), we find all polygons in Q(/) containing rj by 
traversing the union tree in a top-down manner. Suppose we are at a node ^ 
of E. Since / is not crossed by any spine, rj does not lie in any polygon of Qj 
if and only if rj lies below all the chains in Lj (i.e., lies below Lj) and above 
all the chains in (i.e., lies above U^). We thus find the breakpoints of 
that lie immediately to the right of rj and determine in 0(1) time whether rj 
lies below and above U^. If the answer is yes, we conclude that rj does not 
lie in any polygon of Qj, and we stop. If ^ is a leaf and rj lies inside the only 
polygon, say Pi, in Q^, then we return the pair (i,j). If ^ is not a leaf and rj 
lies inside a polygon of Qj, we recursively visit the children of Suppose rj lies 
inside kj polygons of Q(/), then the query procedure visits 0(1 + kj logn„) nodes 
of E. It spends 0(logn„) time at the root and 0(1) at any other node, so the 
time spent in processing rj is 0((1 -I- kj) log Uy). Hence, the algorithm spends 
Fig. 1. A face $f$ of $\mathcal{A}$, the associated sets $U(f)$, $L(f)$, and $\mathbb{SP}(f)$, and an added vertical
segment.



0{{nf + n'j)a{nv) + time at face /. Summing over all 

the faces of $\mathcal{A}$, we obtain that the total time spent in reporting the pairs that
satisfy condition (a), over all faces $f$ of $\mathcal{A}$, is $O((n_v + \mu_v + k_v)\alpha(n_v) \log n_v)$.

Intersections between long lower chains and upper chains. For a face $f$ of $\mathcal{A}$, let
$L(f)$ denote the set of maximal connected portions of the chains in $\mathbb{L}_L$ that lie
inside $f$, let $U(f)$ denote the set of maximal connected portions of upper chains
of (short and long) polygons in $\mathbb{P}(v)$ that lie inside $f$, and let $\mathbb{SP}(f)$ denote the
set of portions of short spines inside $f$. Since we have traced the edges of $\mathbb{P}(v)$
through $\mathcal{A}$, the sets $L(f)$ and $U(f)$ are already available for all faces $f$. We will
report all pairs $(i,j)$ that satisfy (*) and whose rightmost intersection points lie
inside $f$. See Figure 1 for an illustration.

The endpoints of all chains in $L(f)$ lie on $\partial f$ because they are portions
of long chains. Let $A_f$ be the set of edges that constitute $L(f)$ and $\partial f$; set
$a_f = |A_f|$. The union of $A_f$ is connected. If both endpoints of a chain $\gamma \in U(f)$
lie in the interior of $f$, then $\gamma$ is the entire upper chain of a short polygon
$P_j$. In this case, we add a vertical segment $\sigma_j$ from the right endpoint $r_j$ of
$P_j$ downwards until it meets $\partial f$. Let $B_f$ denote the union of the set of edges
that constitute $U(f)$ and $\partial f$, and the set of vertical segments that we have just
added; set $b_f = |B_f|$. By construction, the union of $B_f$ is also connected because
all the upper chains in $U(f)$ are connected to $\partial f$ after introducing the vertical
segments. Since the unions of $A_f$ and of $B_f$ are both connected, we can use the
randomized algorithm of Har-Peled and Sharir ca to compute all $I_f$ intersection
points between the segments of $A_f$ and of $B_f$ that lie in the interior of $f$, in
$O((a_f + b_f + I_f)\alpha(n_v) \log n_v)$ expected time.

The total expected running time spent in reporting the pairs that satisfy
property (b) is $\sum_f O((a_f + b_f + I_f)\alpha(n_v) \log n_v)$. Each endpoint of a segment
of $A_f$ or of $B_f$ is either a vertex of $\mathbb{P}(v)$, or an intersection point of a long spine
and an edge of $\mathbb{P}(v)$, or the lower endpoint of a vertical segment $\sigma_j$. Therefore,
$\sum_f (a_f + b_f) = O(n_v + \mu_v)$. The expected running time is thus
$O((n_v + \mu_v + \sum_f I_f)\alpha(n_v) \log n_v)$.
We call an intersection point of $e \in A_f$ and $e' \in B_f$ real if $e$ is an edge of a
lower chain in $L(f)$ and $e'$ is an edge of an upper chain in $U(f)$; otherwise we
call the intersection point virtual. We report a pair $(i,j)$ if there exists an edge
$e_i$ of $P_i$ in $A_f$ and an edge $e_j$ of $P_j$ in $B_f$ such that the intersection point of
$e_i$ and $e_j$ is the rightmost vertex of $P_i \cap P_j$.

Each real intersection point is an intersection point of $\mathbb{L}_L$ and the upper
chains of $\mathbb{P}(v)$, so the total number of real intersection points, summed over
all faces of $\mathcal{A}$, is $O(\mu_v)$. Since $\partial f$ does not intersect the relative interior of any
segment in $U(f)$ or $L(f)$, a virtual intersection point is an intersection point
$e \cap e'$, where $e$ is an edge of the lower chain of a long polygon $P_i$ and $e'$ is the
vertical segment $\sigma_j$ emanating from the right endpoint $r_j$ of (the upper chain of)
a short polygon $P_j$. We can ignore intersections on $\partial f$ because they correspond
to degenerate intersections between $A_f$ and $B_f$, and, in any case, their number
is only Since Pi is a long polygon, its spine Si is in STi,. Therefore, Si lies 

above the interior of the face $f$ and thus above $r_j$. The intersection of $e$ and $\sigma_j$
implies that $r_j$ is inside $P_i$. We charge the intersection point $e \cap e'$ to the pair $(i,j)$.
Each pair $(i,j)$ is charged by at most one virtual intersection point and the pair
$(i,j)$ is reported at $v$; therefore the total number of virtual intersection points,
summed over all faces of $\mathcal{A}$, is at most $k_v$. Hence, $\sum_f I_f = O(k_v + \mu_v)$, and the
total expected time spent in executing stage (b) is $O((n_v + k_v + \mu_v)\alpha(n_v) \log n_v)$.

We have thus described procedures for reporting all intersecting pairs that
satisfy properties (a)-(c) at a node $v$ of $T$. The total expected time we spend at
$v$ is $O((n_v + k_v + \mu_v)\alpha(n_v) \log n_v)$. Since $\sum_v n_v = O(n \log m)$, $\sum_v k_v = k$, and
$\sum_v \mu_v = O(k)$ (Lemma 2), we obtain the following result.

Theorem 1. Let $\mathbb{P} = \{P_1, \ldots, P_m\}$ be $m$ convex polygons in the plane with a
total of $n$ vertices. All $k$ pairs of indices $(i,j)$ such that $P_i$ intersects $P_j$ can be
reported in $O((n \log m + k)\alpha(n) \log n)$ expected time.

Remark 1. (i) To get a worst-case time bound instead of an expected time bound, 
we can replace the algorithm of Har-Peled and Sharir mi used in the second 
part of the algorithm by an algorithm of Basch et al. . This will increase the 
time bound by a polylogarithmic factor. 

(ii) The algorithm also works when the boundaries of the polygons are composed
of Jordan arcs instead of straight edges, provided the polygons are still
convex. If $t$ is the maximum number of times any pair of Jordan arcs intersect,
the running time of the algorithm becomes $O((\lambda_{t+2}(n) \log m + \lambda_{t+2}(k)) \log n)$.



2.2 An Alternative Deterministic Algorithm 

Let $P_i$ and $P_j$ be two intersecting polygons of $\mathbb{P}$. As above, the rightmost vertex
of $P_i \cap P_j$ is either $r_i$, or $r_j$, or an intersection point of the upper chain of $P_i$ with
the lower chain of $P_j$, or an intersection point of the lower chain of $P_i$ with the
upper chain of $P_j$. Using this observation, we can report the intersecting pairs
of polygons as follows.
Let $V = \{r_i \mid 1 \le i \le m\}$. We first report all intersecting pairs of polygons for
which the rightmost vertex of the intersection polygon is the rightmost vertex of
one of the two polygons. A vertex $r_i$ is the rightmost vertex of $P_i \cap P_j$ if and only
if $r_i \in P_j$. For each $P_i$, we therefore report $P_i \cap V$. Using the range-searching data
structure of Matousek m, we preprocess V, in time 0{{rnP^^n^^^+n) log n), into 
a data structure of size +n), and query it with each Pi. For a polygon 

Pi, all fj,i points of Pi fl U can be reported in time 0(|Pi|(m^/^/n^/^) logn + ^i). 
Hence, the total time spent in this step is log n -I- n log n + fi) where 

^^ = E7=l\P^r^v\<k. 
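
As a point of comparison for this first step, here is a brute-force sketch that tests every rightmost vertex against every polygon directly. It swaps the range-searching data structure for a plain quadratic loop, so it illustrates only the predicate being evaluated, not the running time claimed above. The names `contains` and `rightmost_containment_pairs` are hypothetical, and polygons are assumed to be counterclockwise vertex lists.

```python
def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def contains(polygon, p):
    """True if p lies in the closed convex polygon (counterclockwise order)."""
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], p) >= 0 for i in range(n))

def rightmost_containment_pairs(polygons):
    """Sketch: all (i, j), i != j, such that the rightmost vertex of
    polygon i lies inside polygon j. Quadratic-time baseline."""
    pairs = []
    for i, poly_i in enumerate(polygons):
        r_i = max(poly_i)  # rightmost vertex (x-coordinates assumed distinct)
        for j, poly_j in enumerate(polygons):
            if i != j and contains(poly_j, r_i):
                pairs.append((i, j))
    return pairs
```

Each reported pair corresponds to one point counted in the sum of the sizes of the sets $P_j \cap V$.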

Next, we report the pairs $(i,j)$ such that the rightmost vertex of $P_i \cap P_j$ is an
intersection point of an edge of $P_i$ with an edge of $P_j$. Let $U$ be the set of segments
in the upper chains of the polygons in $\mathbb{P}$, and let $L$ be the set of segments in the
lower chains of these polygons. We compute all i' intersecting pairs of segments 
between U and L. This can be accomplished in log^^^ n + ly) time 1 1 ID] . 

Suppose that an edge e of the upper chain of $P_i$ and an edge e' of the lower chain
of $P_j$ intersect. We check in $O(1)$ time whether $e \cap e'$ is the rightmost vertex of
$P_i \cap P_j$, and, if so, report the pair (i, j). Since an upper chain intersects a lower
chain in at most two points, the number of intersections between U and L is at
most 2k, where k is the number of intersecting pairs of polygons in $\mathcal{P}$.

Hence, we obtain the following result. 

Theorem 2. Let $\mathcal{P}$ be a set of m convex polygons in the plane with a total of n
vertices. All k pairs of indices (i, j) such that $P_i$ intersects $P_j$ can be reported in
$O(n^{4/3} \log n + k)$ time.

Remark 2. (i) Since the range-searching data structure can count the number of points
lying inside a k-gon in $O(k (m^{2/3}/n^{1/3}) \log n)$ time using
$O((m^{2/3} n^{2/3} + m) \log n)$ preprocessing, and the number of intersections between
L and U can be counted in $O(n^{4/3} \log n)$ time, the number of intersecting pairs
of polygons can be counted in $O(n^{4/3} \log n)$ time.

(ii) As in Agarwal and Sharir 0, we can use a more sophisticated data 
structure to improve the running time of the algorithm to log'^ n+k), 

for an appropriate constant c. 



3 The Three-Dimensional Case 

Let $\mathcal{P} = \{P_1, \ldots, P_m\}$ be a set of m convex polytopes in $\mathbb{R}^3$ with a total of
n vertices. We present an algorithm, with running time $O(n^{8/5+\varepsilon} + k)$, for any
$\varepsilon > 0$, which reports all k pairs of indices (i, j) such that $P_i$ intersects $P_j$. Our
approach is similar to the algorithm described in Section 2.2. We compute the
bottom vertex, i.e., the vertex with the minimum z-coordinate, of each nonempty
intersection polytope $P_{ij} = P_i \cap P_j$, and report the corresponding pairs (i, j).
The bottom vertex of an intersection polytope $P_{ij}$ is either the bottom vertex of
$P_i$, or the bottom vertex of $P_j$, or the intersection point of an edge of $P_i$ and a
face of $P_j$, or the intersection point of a face of $P_i$ and an edge of $P_j$. In the two
latter cases, the intersection has to satisfy some additional properties, which we
describe and exploit below.

Fig. 2. An arc $\gamma_{pq}$ and a spherical triangle $\triangle pqr$.

Let $b_i$ be the bottom vertex of $P_i$, and let $V = \{b_i \mid 1 \le i \le m\}$. We first
report all pairs (i, j) such that the bottom vertex of $P_{ij}$ is the bottom vertex of
$P_i$ or of $P_j$. A vertex $b_i \in V$ is the bottom vertex of $P_{ij}$ if and only if $b_i \in P_j$.
Therefore, for each $P_j \in \mathcal{P}$, we need to compute and report $P_j \cap V$. As in
Section El we can accomplish this in time log'^ n + fj.), for some 

constant c, where fj, = \Pj ^ ^1 ^ using the range-searching algorithm 

of Matousek m 

Next, we report all pairs (i, j) such that the bottom vertex of (the nonempty)
$P_{ij}$ is an edge-face intersection. Let E and F denote the sets of edges and of
faces, respectively, of the polytopes in $\mathcal{P}$. Using the data structure of Agarwal
and Matoušek [2], we can compute, in $O(n^{8/5+\varepsilon})$ time, for any $\varepsilon > 0$, a family
of pairs $\mathcal{F} = \{(E_1, F_1), \ldots, (E_r, F_r)\}$, such that

(i) $E_i \subseteq E$ and $F_i \subseteq F$, for all $1 \le i \le r$;

(ii) every edge in $E_i$ crosses every face of $F_i$, for all $1 \le i \le r$;

(iii) for every crossing edge-face pair $(e, f) \in E \times F$, there is an i so that $e \in E_i$
and $f \in F_i$; and

(iv) $\sum_{i=1}^{r} (|E_i| + |F_i|) = O(n^{8/5+\varepsilon})$.

We will describe an algorithm that, for a given pair $(E_i, F_i)$, computes, in
time $O((|E_i| + |F_i|) \log^2 n + \nu_i)$, all $\nu_i$ pairs $(e, f) \in E_i \times F_i$ such that $e \cap f$
is the bottom vertex of the corresponding intersection polytope. Repeating this
procedure for all pairs of $\mathcal{F}$, we report, in time $O(n^{8/5+\varepsilon} + \nu)$ (for a slightly
larger, but still arbitrarily small $\varepsilon > 0$), all pairs (i, j) such that the bottom
vertex of $P_{ij}$ is the intersection of an edge-face pair.

Consider a pair $(E_i, F_i)$ from the family $\mathcal{F}$. For each edge $e \in E_i$ (resp., each
face $f \in F_i$), let $P_e \in \mathcal{P}$ (resp., $P_f \in \mathcal{P}$) be the polytope containing e (resp.,
f). Let $\mathbb{S}^2$ be the unit sphere of directions in $\mathbb{R}^3$, and let $\chi = (0, 0, -1)$ be the
south pole of $\mathbb{S}^2$. For two points $p, q \in \mathbb{S}^2$ that are not antipodal, let $\gamma_{pq} \subset \mathbb{S}^2$
be the shorter arc of the great circle passing through p and q. For three points
$p, q, r \in \mathbb{S}^2$, no two of which are antipodal, let $\triangle pqr$ be the smaller spherical
triangle formed by the arcs $\gamma_{pq}$, $\gamma_{qr}$, and $\gamma_{pr}$. See Figure 2.
Let $n_f$ denote the outward unit normal of the face f. For an edge e, let $\gamma_e$ be
the great circular arc representing all outward normals to the planes supporting
$P_e$ at e. The endpoints $\xi$ and $\eta$ of $\gamma_e$ are the outward normals of the faces of
$P_e$ incident upon e, and $\gamma_e = \gamma_{\xi\eta}$. For an edge $e \in E_i$ and a face $f \in F_i$, let
$\tau_{ef} = \triangle \xi \eta n_f$ be the spherical triangle formed by $\gamma_e$, $\gamma_{\xi n_f}$, and $\gamma_{\eta n_f}$; $\tau_{ef}$ is the
set of outward normals supporting $P_e \cap P_f$ at the vertex $e \cap f$. The following
lemma is straightforward but crucial to our analysis.

Lemma 3. For a pair $(e, f) \in E_i \times F_i$, the intersection point $e \cap f$ is the bottom
vertex of $P_e \cap P_f$ if and only if $\chi \in \tau_{ef}$.

In order to find the edge-face pairs with the above property, we define a
spherical triangle $\Delta_e$ for each edge $e \in E_i$ as follows. Let p and q be the antipodal
points of the endpoints of $\gamma_e$, and let $\bar{\gamma}_e$ be the antipodal arc of $\gamma_e$, i.e., the set of
points that are antipodal to the points on $\gamma_e$. We define $\Delta_e$ to be the spherical
triangle $\triangle pq\chi$, which is bounded by the arcs $\bar{\gamma}_e$, $\gamma_{p\chi}$, and $\gamma_{q\chi}$. We also define
$W_e$ to be the spherical wedge that contains the arc $\bar{\gamma}_e$ and is formed by the
meridians passing through p and q. Finally, let $H_e$ be the hemisphere containing
$\Delta_e$ and bounded by the great circle containing $\gamma_e$ and $\bar{\gamma}_e$ (this circle is the set
of normals to the planes passing through the edge e). Then $\Delta_e = H_e \cap W_e$.
It can be easily checked that $\chi \in \tau_{ef}$ if and only if $n_f \in \Delta_e$, which implies
the following lemma.

Lemma 4. For a given pair $(e, f) \in E_i \times F_i$, the intersection point $e \cap f$ is the
bottom vertex of $P_e \cap P_f$ if and only if $n_f \in \Delta_e$.

Let $\Delta = \{\Delta_e \mid e \in E_i\}$ and $N = \{n_f \mid f \in F_i\}$. For each $\Delta_e \in \Delta$, we wish to
report $\Delta_e \cap N$. Recall that $\Delta_e = W_e \cap H_e$. We thus preprocess N into a two-level
data structure: the first level reports, for any query $\Delta_e$, all points of $W_e \cap N$ as
the union of $O(\log |F_i|)$ canonical subsets, and the second level reports all points
of the canonical subsets that lie inside $H_e$. More precisely, we proceed as follows.
We sort the points in N by their longitudes and construct a minimum-height
binary tree T on the sorted point set (we omit the easy details concerning the
handling of the circularity of this order). Each node u of T is associated with the
subset $N_u \subseteq N$ of points that are stored at the leaves of the subtree rooted at
u. We preprocess $N_u$ for hemisphere reporting queries, where each query reports
all points of $N_u$ lying inside a query hemisphere $H \subset \mathbb{S}^2$. By using a halfplane
reporting structure, we can preprocess $N_u$, in $O(|N_u| \log |N_u|)$ time, into a
data structure of size $O(|N_u|)$, so that a hemisphere query can be answered in
$O(\log |N_u| + t)$ time, where t is the output size. We attach this structure at u as its
secondary structure. The total time spent in preprocessing N is $O(|F_i| \log^2 |F_i|)$.
For an edge $e \in E_i$, we report $\Delta_e \cap N$ as follows. By searching with the longitudes
of the endpoints of $\bar{\gamma}_e$, we first find, in $O(\log |F_i|)$ time, a set $U_e$ of $O(\log |F_i|)$
nodes of T, so that $\bigcup_{u \in U_e} N_u = W_e \cap N$. For each node $u \in U_e$, we report
all $t_u$ points of $N_u \cap \Delta_e$ in $O(\log |F_i| + t_u)$ time, by searching with $H_e$ in the
secondary structure attached to u. Therefore the total time spent in reporting
all $t_e$ points of $\Delta_e \cap N$ is $O(\log^2 |F_i| + t_e)$. Hence, the overall time spent in
reporting all $\nu_i$ pairs of $E_i \times F_i$ such that $e \cap f$ is the bottom vertex of $P_e \cap P_f$
is $O((|E_i| + |F_i|) \log^2 |F_i| + \nu_i)$.
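
The two-level query can be illustrated with a small Python sketch. It is a simplification under stated assumptions: points are represented only by a longitude and a payload, the wedge is a plain longitude interval (circularity is ignored), and the secondary structure is replaced by a linear scan with a caller-supplied predicate `in_hemisphere` rather than a halfplane-reporting structure. All names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def canonical_nodes(n, lo, hi):
    """Split index range [lo, hi) of a sorted array of length n into
    O(log n) canonical ranges of a balanced binary tree over [0, n)."""
    out = []

    def visit(l, r):
        if hi <= l or r <= lo:
            return  # node range disjoint from the query
        if lo <= l and r <= hi:
            out.append((l, r))  # node range fully inside: canonical subset
            return
        mid = (l + r) // 2
        visit(l, mid)
        visit(mid, r)

    if n > 0:
        visit(0, n)
    return out


def wedge_hemisphere_report(points, lon_lo, lon_hi, in_hemisphere):
    """points: list of (longitude, payload). Report payloads whose longitude
    lies in [lon_lo, lon_hi] and that satisfy in_hemisphere(payload)."""
    pts = sorted(points, key=lambda p: p[0])
    lons = [p[0] for p in pts]
    lo, hi = bisect_left(lons, lon_lo), bisect_right(lons, lon_hi)
    result = []
    for l, r in canonical_nodes(len(pts), lo, hi):
        # Second level: a linear scan stands in for the secondary structure.
        result.extend(p[1] for p in pts[l:r] if in_hemisphere(p[1]))
    return result
```

The first level contributes the logarithmic number of canonical subsets; with a proper secondary structure each subset would be queried in logarithmic time plus output size, which is where the $\log^2$ term comes from.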

Summing up all the bounds, and replacing e by a slightly larger, but still 
arbitrarily small constant, we obtain the following. 

Theorem 3. Given a set $\mathcal{P}$ of m polytopes in $\mathbb{R}^3$ with a total of n vertices,
we can report all k pairs of indices (i, j) such that $P_i$ and $P_j$ intersect, in time
$O(n^{8/5+\varepsilon} + k)$, for any constant $\varepsilon > 0$.

Remark 3. The above algorithm can also be modified to count, in $O(n^{8/5+\varepsilon})$
time, the number of all intersecting pairs of polytopes in $\mathcal{P}$.

4 Conclusions 

In this paper, we presented output-sensitive algorithms for reporting all inter- 
secting pairs of convex polygons / polytopes in two and three dimensions. For 
the planar case, we presented a near-linear-time algorithm for this problem. 

An open question is whether there exists an $o(m^2)$-time algorithm for reporting
all pairs of intersecting polytopes in a set $\mathcal{P}$ of m convex polytopes in
Abstract. In this paper we provide an algorithmic approach to the
study of online auctioning. From the perspective of the seller we formalize
the auctioning problem as that of designing an algorithmic strategy
that fairly maximizes the revenue earned by selling n identical items to
bidders who submit bids online. We give a randomized online algorithm
that is $O(\log B)$-competitive against an oblivious adversary, where the
bid values vary between 1 and B per item. We show that this algorithm
is optimal in the worst-case and that it performs significantly better 
than any worst-case bounds achievable via deterministic strategies. Ad- 
ditionally we present experimental evidence to show that our algorithm 
outperforms conventional heuristic methods in practice. And finally we 
explore ways of modifying the conventional model of online algorithms 
to improve competitiveness of other types of auctioning scenarios while 
still maintaining fairness. 



1 Introduction 

Although auctions are among the oldest forms of economic activity known to 
mankind, there has been a renewed interest in auctioning as the Internet has 
provided a forum for economic interaction on an unprecedented scale. Indeed, 
a number of web sites have been created for supporting various kinds of auc- 
tioning mechanisms. For example, at www.priceline.com, users present bids on 
commodity items without knowledge of prior bids, and the presented bids must 
be immediately accepted or rejected by the seller. Alternately, web sites such as 
www.ebay.com and www.ubid.com allow bidding on small lots of non-commodity 
items, with deadlines and exposure of existing bids. The rules for bidding vary 
considerably, in fact, even in how equal bids for multiple lots are resolved. In- 
terestingly, it is a simple exercise to construct bidding sequences that result in 
suboptimal profits for the seller. For example, existing rules at www.ubid.com 
allow a $100 bid for 10 of 14 items to beat out two $70 bids for 7 items each. 
Thus, we feel there could be considerable interest in algorithmic strategies that 
allow sellers to maximize their profits without compromising fairness. 

* A significant portion of this work was done while this author was at IBM’s India 
Research Lab. 
Since Internet auctioning requires a great deal of trust, notions of fairness
and legitimacy are fundamental properties for an auction [22]. Indeed, there are
considerable previous mathematical treatments of auctioning that classify various
types of auctions, together with evolving criteria of fairness.
Still, the algorithmic aspects of auctioning have been largely neglected.

1.1 The Topic of Interest 

Given the interactive nature of Internet auctioning today, we feel it most ap- 
propriate to study auctioning strategies from an online algorithms perspective. 
That is, algorithms must make immediate decisions based on existing, incomplete 
information, and are not allowed to delay responses to wait for future offers. 

Moreover, given that existing auctioning web sites must implement what 
are essentially algorithmic rules for accepting or rejecting bids, in this paper 
we focus on algorithmic strategies for sellers. Even so, we restrict our study to 
strategies that are honest and fair to buyers. For example, we would consider 
as unacceptable a strategy that uses a fictitious external bidder that causes a 
real bidder to offer more than he would had there been less bidding competition. 
Besides being unethical, dishonest or unfair strategies are ultimately detrimental 
to any Internet auction house anyway, since their discovery drives away bidders. 



1.2 Previous Related Work 



Offline scenarios for auctioning, where all bids are collected at one time, such
as in sealed bid auctions, have been studied and understood in terms of knapsack
problems, for which the algorithms community has produced considerable
previous work. We are not aware of much previous work on online
auctioning strategies, however.

The general area of online algorithms studies combinatorial optimization
problems where the problem instance is presented interactively over time but
decisions regarding the solution must be made immediately. Even though such
algorithms can never know the full problem instance until the end of the sequence
of updates (whose arrival itself might not even be known to the online
algorithm), online algorithms are typically compared to optimal offline algorithms.
We say that an online algorithm is c-competitive with respect to an
optimal offline algorithm if the solution determined by the online algorithm differs
from that of the offline algorithm by at most a factor of c in all cases.
The goal, therefore, in online algorithm design is to design algorithms that are 
c-competitive for small values of c. Often, as will be the case in this paper, we 
can prove worst-case lower bounds on the competitive ratio, c, achievable by an 
online algorithm. Such proofs typically imply an adversary who constructs input 
sequences that lead online algorithms to make bad choices. In this paper, we 
restrict our attention to oblivious adversaries, who can have knowledge of the 



^ As a matter of convention we can never have a c-competitive algorithm for c < 1 
(such algorithms would instead be called 1/c-competitive) 
online algorithm we are using, but cannot have access to any random bits that
it may use.

In work that is somewhat related to online auctioning, Awerbuch, Azar and
Plotkin study online bandwidth allocation for throughput-competitive routing
in networks. Their approach can be viewed as a kind of bidding strategy for
bandwidth, but differs from our study, since in bandwidth allocation the problem
of communication path determination is at least as difficult as that of bandwidth
capacity management. Leonardi and Marchetti-Spaccamela generalize the
result of Awerbuch et al., but in a way that is also not directly applicable to
online auctioning, since, again, there is no notion of path determination in online
auctioning.

Work on online call control is also related to the problems we consider.
In online call control, bandwidth demands made by phone calls must be
immediately accepted or rejected based on their utility and on existing phone
line usage. In fact, our work uses an adaptation of an algorithmic design pattern
developed by Awerbuch et al. and Lipton, which Awerbuch et al. call
“classify-and-select.” In applying this pattern to an online problem, one must
find a way to partition the optimization space into q classes such that, for each
class, one can construct a c-competitive algorithm (all using the same value of
c). Combining all of these individual algorithms gives an online algorithm with a
competitive ratio that is $O(cq)$. Ideally, the individual c-competitive algorithms
should be parameterized versions of the same algorithm, and the values c and
q should be as small as possible. Indeed, the classify-and-select pattern is best
applied to problems that can be shown to require competitive ratios that are
$\Omega(cq)$ in the worst case against an oblivious adversary.



1.3 Our Results 

We consider several algorithmic issues regarding online auctioning, from the
viewpoint of the seller, in this paper. We begin, in Section 2, by defining the
multiple-item B-bounded online auctioning problem, in which bidders bid on multiple
instances of a single item, with each bidder allowed to bid for as many items
as he or she wants to. We present an online algorithm for this problem that is
$O(\log B)$-competitive with an oblivious adversary. The upper bound result presented
in this section is based on adaptations of the classify-and-select design
pattern to the specific problem of online auctioning.

In Section 3 we show that it is not possible for any deterministic algorithm
to provide a satisfactory competitive ratio for this problem. Moreover, we show
that the algorithm we give in Section 2 is “optimal” in the sense that no randomized
algorithm can achieve a competitive ratio of $o(\log B)$. To do this we
derive lower bounds, based on novel applications of Yao’s “randomness shifting”
technique, that show that the competitive ratio of our algorithm is worst-case
optimal.

In order to show that our algorithm performs well in practice we undertook
a number of experiments. The results, detailed in Section 4, demonstrate that
our algorithm handles different types of input sequences with ease and is vastly
superior to other online strategies in the difficult case where the bids vary greatly
in size and benefit.

Finally, in Section 5 we discuss a possible modification of our model with an
eye towards making it more flexible as regards its online nature so as to gain
in terms of competitiveness. In particular, we allow the algorithm to “buffer”
a certain number of bids before making a decision. We show that by buffering
only $O(\log B)$ bids it is possible to be c-competitive with an oblivious adversary
for the case in which we are selling a single item, for a constant c.



2 Multiple-Item B-Bounded Online Auctioning

In this section we introduce the multiple-item B-bounded online auctioning prob- 
lem. We have n instances of the item on sale and the bids which come in for 
them offer varying benefit per item. Each bid can request any number of items 
and offer a given benefit for them. The objective is to maximize the profit that 
can be earned from the sequence of bids with the additional requirement that 
the seller accept or reject any given bid before considering any future bids, if 
they exist. 

The price density of a bid is defined as the ratio of the benefit offered by 
the bid to the number of instances of the item that the bid wants to buy. In 
other words the price density is the average price per item the bidder is willing 
to pay. The range of possible price densities that can be offered is between 1 and 
B, inclusively. This restriction is made without loss of generality in any scheme 
for single-item bidding that has bounded bid magnitude, as we can alternately 
think of B as the ratio of the highest and lowest bids that can possibly be made 
on this item. A sequence of bids need not contain the two extreme values, 1 and 
B, and any bid after the first need not be larger than or equal to the previous 
bid. 
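
As a small illustration of the definition, the following Python sketch computes price densities and checks the B-bounded condition. The representation of a bid as a (benefit, items) pair and the function names are assumptions made for the example; exact rational arithmetic is used only to avoid rounding issues.

```python
from fractions import Fraction


def price_density(benefit, items):
    """Average price per item offered by a bid for `items` instances."""
    if items <= 0:
        raise ValueError("a bid must request at least one item")
    return Fraction(benefit, items)


def is_b_bounded(bids, B):
    """True if every bid's price density lies in the range [1, B]."""
    return all(1 <= price_density(b, q) <= B for b, q in bids)
```

For instance, the $100 bid for 10 items and the $70 bid for 7 items mentioned in the introduction both have price density 10.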

We assume that the algorithm knows the value of B. We discuss at the end 
of this section how this assumption can be dispensed with. 

For this problem we propose an algorithm that uses an adaptation of a random
choice strategy of Awerbuch and Azar [5] together with the “classify and select”
technique, where we break the range of possible optimization values into
groups of ranges and select sets of bids which are good in a probabilistic sense
based on these ranges. Our algorithm is described in Figure 1.

Theorem 1. Price_And_Pack is an $O(\log B)$-competitive algorithm for the
multiple-item B-bounded online auctioning problem.

The proof of this theorem is in Appendix A. ■

An important thing to note is that here the algorithm has to know the range
of the input, i.e., the algorithm has to be aware of the value of B. It is possible
to dispense with this assumption to get a slightly weaker result.
In other words, it is possible to give an $O((\log B)^{1+\varepsilon})$-competitive algorithm,
for any $\varepsilon > 0$, which does not know the value of B beforehand. We do not
Algorithm Price_And_Pack

— Select i uniformly at random from the integers 0 to $\log B - 1$.

— If i is 0 then set pdr = 1, else set pdr = $2^{i-1}$.

— Define a bid as legitimate if it has a price density of at least pdr.

— Toss a fair coin with two outcomes before any bid comes in.

— If the coin has landed heads then wait for a legitimate bid for more than
n/2 items to come in, rejecting all smaller bids and all illegitimate bids.
— Else keep accepting legitimate bids while there is capacity to satisfy them.
Reject all illegitimate bids.



Fig. 1. Price_And_Pack: Auctioning multiple items with bids of varying benefit.



detail it here because it does not provide any further insight into the problem of
auctioning.

In the next section we give lower bounds which will show that Price_And_Pack
gives the best possible competitive ratio for this problem.
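
The steps of Figure 1 can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation. Bids are assumed to be (benefit, items) pairs and B a power of two. Two behaviours that the figure leaves open are filled in as labeled assumptions: after heads, the algorithm stops once the large bid is accepted, and after tails, a legitimate bid that no longer fits is rejected and later bids are still considered.

```python
import math
import random


def run_price_and_pack(bids, n, pdr, heads):
    """Deterministic core: process (benefit, items) bids in arrival order."""
    accepted, capacity = [], n
    for benefit, items in bids:
        # Legitimate means price density benefit/items >= pdr.
        if benefit < pdr * items:
            continue  # illegitimate bids are always rejected
        if heads:
            # Wait for one legitimate bid asking for more than n/2 items.
            if 2 * items > n and items <= capacity:
                accepted.append((benefit, items))
                break  # assumption: stop after the large bid is accepted
        elif items <= capacity:
            # Assumption: a bid that no longer fits is rejected and skipped.
            accepted.append((benefit, items))
            capacity -= items
    return accepted


def price_and_pack(bids, n, B, rng=None):
    """Draw the random choices of Figure 1, then run the core."""
    rng = rng or random.Random()
    levels = max(1, int(math.log2(B)))  # integers 0 .. log B - 1
    i = rng.randrange(levels)
    pdr = 1 if i == 0 else 2 ** (i - 1)
    heads = rng.random() < 0.5
    return run_price_and_pack(bids, n, pdr, heads)
```

Separating the random draws from the deterministic core makes each branch easy to inspect for a fixed threshold and coin outcome.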

3 Lower Bounds for the Online Auctioning Problem 

We consider the version of the online auctioning problem in which there is only
one item to be auctioned and the range of possible prices that can be offered for
this item is between 1 and B, inclusively. We call this the single-item B-bounded
online auctioning problem. We give lower bounds for this problem. Upper bounds
for this problem are given in the literature. It is easy to see that a lower bound on any
algorithm for the single-item problem is a lower bound for the multiple-item
problem as well.

In this section we first prove that no deterministic algorithm can in the
worst case have a competitive ratio better than the maximum for the single-item
problem. More precisely, we show that every deterministic algorithm must
have a worst-case competitive ratio that is $\Omega(B)$. This lower bound is based on
the fact that a seller does not know in advance how many bids will be offered.
Even so, we also show that even if the seller knows in advance the number of bids
in the input sequence, any deterministic algorithm is limited to a competitive
ratio that is $\Omega(\sqrt{B})$ in the worst case.

Theorem 2. Any deterministic algorithm for the single-item B-bounded online
auctioning problem has a competitive ratio that is $\Omega(B)$ in the worst case.

Proof: For a given deterministic algorithm A we construct an adversarial input
sequence $I_A$ in the following way: Let the first bid in $I_A$ be of benefit 1. If A
accepts this bid, then $I_A$ is the sequence {1, B}. In this case, on the sequence $I_A$,
the deterministic algorithm A gets a benefit of 1 unit while the offline optimal
algorithm would pick up the second bid, thereby earning a benefit of B units.

If A does not accept this first bid, then $I_A$ is simply the sequence {1}. In this
case A earns 0 units of revenue while the optimal offline algorithm would accept
the bid of benefit 1. ■
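
The adversary in this proof is simple enough to write down directly. In the Python sketch below, a deterministic single-item algorithm is modeled (as an assumption of the example) by a function that receives the prefix of bids seen so far and returns whether it accepts the latest one; the names are hypothetical.

```python
def revenue(algorithm, bids):
    """Benefit of the first bid the algorithm accepts, or 0 if none."""
    for t in range(len(bids)):
        if algorithm(bids[: t + 1]):
            return bids[t]
    return 0


def adversarial_input(algorithm, B):
    """The sequence I_A: {1, B} if A accepts the first bid, else {1}."""
    return [1, B] if algorithm([1]) else [1]
```

An algorithm that accepts the first bid earns 1 against an optimum of B, and one that rejects it earns 0 against an optimum of 1.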
Of course, B is the worst competitive ratio that is possible for this problem,
so this theorem implies a rather harsh constraint on deterministic algorithms. 
Admittedly, the above proof used the fact, perhaps unfairly, that the seller does 
not know in advance the number of bids that will be received. Nevertheless, as we 
show in the following theorem, even if the number of bids is known in advance, 
one cannot perform much better. 

Theorem 3. Any deterministic algorithm for the single-item B-bounded online
auctioning problem, where the number of bids is known in advance, has a competitive
ratio that is $\Omega(\sqrt{B})$ in the worst case.

Proof: Consider the input sequence $I_{base} = \{1, 2, 4, \ldots, 2^i, \ldots, B/2, B\}$. For any
deterministic algorithm A we construct our adversarial sequence $I_A$ based on
what A does with $I_{base}$.

We recall here that since we are considering the single-item problem, any 
deterministic algorithm essentially picks at most one of the bids in the input 
sequence. 

Suppose A accepts some bid $2^i < \sqrt{B}$. Then we choose $I_A$ to be the same as
$I_{base}$. In this case A’s benefit is less than $\sqrt{B}$, whereas an optimal offline algorithm
would earn B units, thereby making A an $\Omega(\sqrt{B})$-competitive algorithm.

If A accepts some bid $2^i > \sqrt{B}$, on the other hand, then we choose $I_A$ to be
$\{1, 2, 4, \ldots, 2^{i-1}, 1, 1, \ldots\}$, i.e., we stop increasing the sequence just before A
accepts and then pad the rest of the sequence with bids of benefit 1. This way
A can get no more than 1 unit of benefit while the optimal offline algorithm gets
$2^{i-1}$, which we know is at least $\sqrt{B}$.

If A accepts none of the bids in $I_{base}$ then it is not a competitive algorithm
at all (i.e., it earns 0 revenue while the optimal offline algorithm earns B units)
and so we need not worry about it at all. ■
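
The construction in this proof can be sketched as follows. As an assumption of the example, a deterministic algorithm is a function of the prefix of bids seen so far and the known total number of bids, and B is a power of two. The boundary case in which the accepted bid equals $\sqrt{B}$ exactly is treated like the second case here, which is our choice rather than something the proof specifies.

```python
def base_sequence(B):
    """The doubling sequence 1, 2, 4, ..., B (B assumed a power of two)."""
    seq, v = [], 1
    while v <= B:
        seq.append(v)
        v *= 2
    return seq


def first_accept(algorithm, bids):
    """Index of the first bid the algorithm accepts, or None."""
    for t in range(len(bids)):
        if algorithm(bids[: t + 1], len(bids)):
            return t
    return None


def adversarial_input(algorithm, B):
    """Build I_A from what the algorithm does on the base sequence."""
    base = base_sequence(B)
    t = first_accept(algorithm, base)
    if t is None or base[t] ** 2 < B:
        return base  # A accepts nothing, or a bid below sqrt(B)
    # Stop just before the accepted bid and pad with bids of benefit 1,
    # keeping the number of bids unchanged.
    return base[:t] + [1] * (len(base) - t)
```

Because the padded sequence has the same length and the same prefix as the base sequence, a deterministic algorithm repeats its earlier rejections and can then collect at most 1.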

It is easy to see that the deterministic algorithm that either picks up a bid
of benefit at least $\sqrt{B}$ or, if it does not find such a bid, picks up the last bid,
whatever it may be, succeeds in achieving a competitive ratio of $O(\sqrt{B})$.
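
A minimal Python sketch of this threshold rule, assuming the bids are given as a list of benefits whose length is known in advance (the function name is ours):

```python
def threshold_sell(bids, B):
    """Revenue of the rule: first bid >= sqrt(B), else the last bid."""
    for t, b in enumerate(bids):
        # b * b >= B avoids floating-point square roots.
        if b * b >= B or t == len(bids) - 1:
            return b
    return 0  # empty input
```

If some bid reaches $\sqrt{B}$, the rule earns at least $\sqrt{B}$ against an optimum of at most B; otherwise the optimum is below $\sqrt{B}$ and the rule earns at least 1.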

Theorem 2 tells us that no deterministic algorithm can effectively compete
with an oblivious adversary in the worst case, if the number of bids is not known
in advance. Indeed, although the proof used a sequence that consisted of either
one bid or two, the proof can easily be extended to any sequence that is either of
length n or n + 1. This bleak outlook for deterministic algorithms is not improved
much by knowing the number of bids to expect, however, as shown in Theorem 3.

Furthermore, we show that even randomization does not help us too much.
We can use Yao’s principle [21] to show that no randomized algorithm can be
more competitive against an oblivious adversary than Sell_One.

Theorem 4. Any randomized algorithm for the single-item B-bounded online
auctioning problem is $\Omega(\log B)$-competitive in the worst case.

The proof is sketched in Appendix B. ■

^ Bids are not always increasing. For a single item there is no issue at all if the bids 
are always increasing and the number of bids is known. Just wait for the last bid. 
4 Experimental Results

In order to give an idea of the efficacy of Price_And_Pack we present the results
of simulated auctions which use this algorithm.

The input sequences were generated by selecting each bid from a given prob- 
ability distribution. The three distributions used were: Normal, Poisson and 
Uniform. Both the number of items being bid for and the price density offered 
by the bid were chosen from the same distribution. 

We chose three different combinations of n and B and generated 100 input
sequences for each combination. To get a good approximation to the average
benefit of Price_And_Pack we ran the algorithm 1000 times on each instance and
averaged the benefit over all these runs.

We determined a lower bound on the amount of revenue obtained by our algorithm
compared to the maximum possible revenue. To do this we implemented
an offline algorithm which has been shown to be a 2-approximation. By dividing
the revenue obtained by Price_And_Pack by 2 times the revenue obtained
by the offline algorithm we were able to provide a number which is effectively a
lower bound on the actual ratio.
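
The arithmetic behind this lower bound is short: if the offline algorithm is a 2-approximation, the optimum is at most twice its revenue, so the online revenue divided by twice the offline revenue cannot exceed the true ratio. A minimal sketch, with hypothetical names and with the per-run revenues supplied by the caller:

```python
def ratio_lower_bound(online_revenues, approx_offline_revenue, approx_factor=2):
    """Lower bound on (average online revenue) / (optimal revenue).

    Relies on optimal <= approx_factor * approx_offline_revenue.
    """
    average = sum(online_revenues) / len(online_revenues)
    return average / (approx_factor * approx_offline_revenue)
```

For example, online runs averaging 40 against an approximate offline revenue of 40 give a guaranteed ratio of at least 0.5.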
Fig. 2. Price_And_Pack v/s the optimal offline algorithm.



The numbers in Figure 2 show that in practice Price_And_Pack performs
quite well compared to the optimal offline algorithm and significantly better
than the bound of $O(\log B)$ would suggest. We see that in the two distributions
which tend to cluster sample points near the mean, i.e., Normal and Poisson, the
algorithm does especially well. However, these distributions provide fairly regular
input instances. The real power of Price_And_Pack is on view when the input
instances have widely varying bids.

To demonstrate this we compared the performance of a simple Greedy heuristic
with the performance of Price_And_Pack. Greedy simply accepts bids while
it has the capacity to do so. In Figure 3 we present the results in terms of
the percentage extra revenue Price_And_Pack is able to earn over the Greedy
heuristic.
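
For reference, the Greedy baseline and the percentage advantage can be sketched as below. Bids are assumed to be (benefit, items) pairs, and, as an assumption not settled by the description above, a bid that does not fit in the remaining capacity is rejected while later bids are still considered.

```python
def greedy(bids, n):
    """Revenue of accepting every bid that still fits in the capacity n."""
    capacity, revenue = n, 0
    for benefit, items in bids:
        if items <= capacity:
            capacity -= items
            revenue += benefit
    return revenue


def percent_advantage(revenue, baseline):
    """Percentage of extra revenue earned over a positive baseline."""
    return 100.0 * (revenue - baseline) / baseline
```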
Fig. 3. Price_And_Pack over Greedy: % advantage.




Fig. 4. Price_And_Pack over Greedy: % advantage in 100 individual runs. 



We see that when the bids are comparable to each other (i.e., when they
are generated by the Normal or Poisson distribution) then Price_And_Pack does
not do significantly better than Greedy, but when the bids vary widely in size
(i.e., when they are generated by the Uniform distribution) then Price_And_Pack
definitely outperforms Greedy.

In Figure 4 we graph the percentage extra revenue earned by Price_And_Pack
in 100 different input instances for a given choice of n and B. It is clear from
the graph that Price_And_Pack consistently outperforms Greedy.



5 Improving Competitiveness by Giving Intermediate 
Information 

In this section we look at a way of modifying the auctioning model to improve
competitiveness. Taxonomies of auctions have classified auctions
along three broad categories: bid types, clearance mechanisms, and intermediate
information policies. We look at this last category to help us
improve competitiveness.



Seller-Focused Algorithms for Online Auctioning 143 



In the preceding sections we saw that in the conventional online model, where
every bid has to be accepted or rejected before the next bid comes in, we are
limited to a competitive ratio of $\Omega(\log B)$. However, it is possible to do better if
we relax the online model slightly. In real-life auctions, bids are not always
accepted or rejected immediately: most auctions do release some intermediate
information. For example, in the outcry type of auction the current highest bid
is announced. This amounts to informing
those who bid at that level that their bid is still under consideration, although
it might yet be beaten out by a better bid.

The problem with the outcry auction is that, in the case of a monotonically
increasing sequence of bids, each bid is asked to hold on and then rejected, i.e.,
$O(B)$ bids are made to wait till a new bid comes in before being rejected. This
is clearly unacceptable, since from the point of view of a bidder an intermediate
notification that the bid is still under consideration is tantamount to saying that
this bid has a reasonable chance of success. However, if the bidder knows that
$O(B)$ bids could be asked to hold then he might not consider this chance of
success reasonable.

So, the model we propose is that only a certain small number of bids can be
asked to wait without a definite notification of acceptance or rejection. We can
think of these bids as being held in a buffer that contains only one bid at a time
and is not allowed to hold more than a certain small number k of bids in total
over the course of the auction. We call this structure a k-limited access
buffer, or a k-LAB. We denote the highest bid held in the LAB by H(LAB).

That this relaxation is useful becomes immediately evident when we consider
that a $(\log B)$-LAB allows us to become constant-competitive deterministically
with the optimal algorithm in the case where we want to sell one item and we
know the number of bids. We give the algorithm for this in Figure 5. We have
to view this in light of the fact that in Theorem 3 we showed that in a purely
online setting it is not possible to do better than $\Omega(\sqrt{B})$ deterministically for
this problem.



Algorithm LAB_Sell_One_N 

- LAB ← b_0 

- For each bid b_i, for i going from 1 to N, do 

    - if b_i > 2·H(LAB) then LAB ← b_i else reject b_i 

- Accept the bid in the LAB. 



Fig. 5. LAB_Sell_One_N: Auctioning a single item with a buffer when the number of 
bids is known. 
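
The buffered rule of Figure 5 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lab_sell_one`, the list representation of the bid sequence, and the returned counter of buffered bids are assumptions made for the example, not part of the original algorithm statement.

```python
def lab_sell_one(bids):
    """Sell one item with a limited access buffer (LAB).

    `bids` is the whole sequence b_0, b_1, ..., b_N (hypothetical list input).
    Returns (accepted_bid, number_of_bids_ever_buffered).
    """
    if not bids:
        raise ValueError("need at least one bid")
    held = bids[0]              # LAB <- b_0
    times_buffered = 1
    for b in bids[1:]:
        if b > 2 * held:        # strictly more than twice the buffered bid
            held = b            # the previously buffered bid is now rejected
            times_buffered += 1
        # otherwise b is rejected immediately
    return held, times_buffered  # accept the bid left in the LAB


accepted, buffered = lab_sell_one([1, 2, 4, 8, 16])
print(accepted, buffered)
```

Every bid that enters the buffer is more than twice the one it replaces, so with bids in the range [1, B] only about log B bids are ever asked to wait, which is why a buffer of that capacity suffices in this sketch.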



Theorem 5. LAB_Sell_One_N is 2-competitive with the optimal. 

Proof: It is quite easy to see that the highest bid $b_{\mathrm{off}}$, which is the 
benefit of the offline algorithm, would not be put in the buffer only if a bid 
of at least half its value were already in there. A bid in the buffer is only 
ever replaced by a higher bid, so the bid finally accepted by the online 
algorithm is at least half of $b_{\mathrm{off}}$. Therefore the benefit of the online 
algorithm is at least half of the offline algorithm's. ■ 

Acknowledgements. The authors would like to thank Leslie Hall and two 
A Proof of Theorem |T] 

Let the optimal offline algorithm OPT achieve profit density $p$ on a given input 
sequence $I$. So if the optimal algorithm sells $n' \le n$ items, its total profit 
is $n'p$. Let $j$ be the largest integer such that $2^j \le 4p/5$. Define 
$\alpha = 2^j/p$. We say that Price-And-Pack chooses $i$ correctly if the chosen 
value of $i$ equals $j$. It is easy to see that $i$ is chosen correctly with 
probability $1/\log B$. In that event, bids of price density greater than $p\alpha$ 
are legitimate while the rest are not. Note that $\alpha \in (2/5, 4/5]$. 

Let $I_p$ be the subset of $I$ comprising all bids in $I$ which have price density 
greater than $p\alpha$. 

Lemma 1. The sum of the revenues obtained by the optimal algorithm running 
on $I_p$ is no less than $n'p(1 - \alpha)$, where $p$ is the profit density of OPT 
on $I$ and $n'$ is the number of items it sells. 

Proof: Suppose that OPT sells some $n_{lt} \le n'$ instances to bids in $I - I_p$, 
and let $rev_{ge}$ be the revenue earned by OPT from items which were sold to bids 
in $I_p$. Clearly, 

\[ rev_{ge} + n_{lt} \cdot p\alpha \ge n'p. \]

This gives us 

\[ rev_{ge} \ge n'p - n_{lt} \cdot p\alpha, \]

and since $n_{lt} \le n'$ we get 

\[ rev_{ge} \ge n'p(1 - \alpha). \]

Since $rev_{ge}$ is the revenue obtained from a subset of the bids in $I_p$, the 
result follows. ■ 
Proof of the Theorem 

We consider the following three cases, and show that in each case the expected 
revenue of Price-And-Pack is at least a $\frac{1}{10 \log B}$ fraction of the 
revenue of OPT. 

Case 1: There is a bid of size greater than $n/2$ in $I_p$. 

With probability at least $1/\log B$, Price-And-Pack chooses $i$ correctly. With 
probability 1/2, Price-And-Pack chooses to wait for a bid of size greater than 
$n/2$. Thus, with probability at least $\frac{1}{2 \log B}$, Price-And-Pack will 
accept a bid of size at least $n/2$ and price density at least $\alpha p$. 

So in this case the expected revenue of Price-And-Pack is at least 
$\frac{1}{2 \log B} \cdot \frac{n}{2} \cdot \alpha p = \frac{n \alpha p}{4 \log B}$. 
Since the revenue earned by OPT is at most $np$, and $\alpha > 2/5$, in this case 
Price-And-Pack is $10 \log B$ competitive with OPT. 

Case 2: There is no bid of size greater than $n/2$ in $I_p$, and the total number 
of items demanded by the bids in $I_p$ is more than $n/2$. 

With probability 1/2, Price-And-Pack will choose to accept bids of any size. 
If it also chooses $i$ correctly (the probability of which is $1/\log B$), it will 
sell at least $n/2$ instances*, and earn a revenue of at least $p\alpha$ units for 
every item sold. 

Thus, with probability $\frac{1}{2 \log B}$, Price-And-Pack sells at least $n/2$ 
instances to bids whose price densities are no smaller than $p\alpha$. This means 
that, in this case, the expected revenue of Price-And-Pack is at least 
$\frac{n p \alpha}{4 \log B} > \frac{np}{10 \log B}$, which makes it $10 \log B$ 
competitive with OPT. 

Case 3: There is no bid of size greater than $n/2$ in $I_p$, and taken together the 
bids in $I_p$ demand no more than $n/2$ instances. 

Again, with probability 1/2 Price-And-Pack decides to accept all bids, and 
with probability $1/\log B$, $i$ is chosen correctly. Thus, with probability 
$\frac{1}{2 \log B}$ our algorithm accepts all bids in $I_p$ and, by Lemma 1, earns 
a revenue no smaller than $n'p(1 - \alpha)$, where $n'$ is the number of items sold 
by OPT. So its expected revenue is at least 
$\frac{n'p(1 - \alpha)}{2 \log B} \ge \frac{n'p}{10 \log B}$ (using $\alpha \le 4/5$), 
which makes it $10 \log B$ competitive with OPT in this case. 



B Proof Sketch for Theorem 0] 

We use Yao’s Principle to show a lower bound for all randomized algorithms. 
To do this we give a probability distribution over the input space 
and determine the expected benefit of the best possible deterministic algorithm 
on this probabilistic input. The competitiveness of this expectation against the 
expected benefit of the optimal offline algorithm for this probabilistically dis- 
tributed input will be, by Yao’s Principle, a lower bound on the competitiveness 
of any randomized algorithm for this problem. Due to lack of space we sim- 
ply describe the probability distribution on the inputs here. The entire proof is 
available in jS|. 

* If Price-And-Pack accepts all bids in $I_p$, it sells at least $n/2$ instances. If 
it rejects any bid in $I_p$, it must not have enough capacity left to satisfy it. 
But then at least $n/2$ instances must have been sold, since any bid in $I_p$ 
(in particular the rejected bid) is of size no more than $n/2$. 






Consider the following input sequence, which we will call the base sequence 
or $I_{\mathrm{base}}$: $(1, 2, 4, \ldots, B/2, B)$. Our set of input sequences will be 
derived from $I_{\mathrm{base}}$ by truncating it at a given point and substituting 
bids of revenue 1 for the tail of the input sequence. All other inputs occur with 
probability 0. The set of inputs, 
$\mathcal{I} = \{I_1, I_2, \ldots, I_{\log B}\} \cup \{I_f\}$, and the associated 
probabilities are described as: 

~ li = (1, 2,4, ... 2*, 1 .. . 1} occurs with probability Pi = E^ch A has 

log i? -I- 1 bids. 

~ // = {!} occurs with probability Pf = ^ . 
Abstract. We present a competitive analysis of the LRFU paging algorithm, 
a hybrid of the LRU (Least Recently Used) and LFU (Least 
Frequently Used) paging algorithms. We show that the competitive ratio 
of LRFU is $k + \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1$, 
where $\frac{1}{2} \le \lambda \le 1$ is the decay parameter 
used by the LRFU algorithm, and $k$ is the size of the cache. This supplies, 
in particular, the first natural paging algorithms that are competitive but 
are not optimally competitive, answering a question of Borodin and El- 
Yaniv. Although LRFU, as it turns out, is not optimally competitive, it 
is expected to behave well in practice, especially in web applications, as 
it combines the benefits of both LRU and LFU. 



1 Introduction 

Paging (cache replacement) algorithms have been extensively studied and 
deployed. Two important applications are operating systems virtual memory pag- 
ing and caching of Web content. The input to a paging algorithm is a sequence 
of requests to different pages and the size of the cache. The algorithm decides 
which page to evict when a request arrives for a page which is not presently in 
the cache and the cache is full. Often, the choice of a paging algorithm can have 
a significant effect on performance. 

Two important paging algorithms are Least Recently Used (LRU) and Least 
Frequently Used (LFU). LRU is arguably the most commonly deployed in prac- 
tice. One such example is the popular Squid |3 Web caching software. When 
LRU has to evict a page from its cache, it chooses to evict the least-recently 
requested page. LRU exploits temporal locality in request sequences and the 
recency property which states that recently-requested objects are more likely to 
be requested next. LFU is another policy that seems to perform well on Web 
request sequences. LFU evicts the page that was requested the fewest number 
of times. LFU comes in two flavors: in-cache LFU which counts the number of 
times a page was requested since it entered the cache, and perfect LFU which 
counts the total number of requests made to the page since the start of the se- 
quence. LFU exploits the frequency property of request sequences which states 
that pages that were requested more times are more likely to be requested next. 
Recent studies (e.g., [2]) comparing cache replacement algorithms concluded 
that LRU and LFU exhibit comparable performance on Web request sequences. 
In general, performance depends on the interplay between the recency and fre- 
quency properties of the particular sequence. Frequency seems to matter more 
for larger caches and recency has a more pronounced impact with smaller cache 
sizes. Similar behavior was observed for page request sequences in operating sys- 
tems 0. Motivated by these observations, Lee et al. 0 defined a spectrum of 
hybrid policies between LRU and LFU. They named these policies LRFU$_\lambda$, 
with the parameter $\lambda$ varying between $\lambda = 0$ (LRU) and $\lambda = 1$ 
(LFU). Their experiments, based on simulations using page requests generated by 
different application programs, demonstrated that the performance curve as a 
function of $\lambda$ is smooth and that the dependence on $\lambda$ is bitonic: the 
miss rate first decreases and then increases. Thus LRFU$_\lambda$ typically 
outperforms at least one of the endpoints, and with the best choices of $\lambda$, 
LRFU$_\lambda$ often outperforms both LRU and LFU. Their results show that 
LRFU$_\lambda$ does indeed provide a desirable spectrum in terms of performance. 

Competitive analysis has been the leading theoretical tool for analyzing the 
performance of paging algorithms. A paging algorithm $A$ is said to have a 
competitive ratio of at most $c$ if on any request sequence, the number of misses 
of $A$ is at most $c$ times the number of misses of the optimal offline algorithm, 
plus some constant. A miss occurs when a requested page is not in the cache. The 
competitive ratios of LRU and LFU demonstrate both the strengths and the 
shortcomings of the ability of competitive analysis to capture actual behavior. 
The LRU algorithm is optimally competitive. That is, its competitive ratio of $k$ 
is the best possible for any deterministic paging algorithm ($k$ is the maximum 
number of pages that can fit in the cache). On the other hand, the factor $k$ obtained on 
worst-case adversarial sequences is very far from the typical ratio on actual se- 
quences. Moreover, LFU, which performs comparably to LRU on Web sequences 
has an unbounded competitive ratio (is not competitive). Actual sequences, as 
opposed to worst-case sequences, typically exhibit the recency and frequency 
properties exploited by LRU and LFU. 

Interestingly, most natural deterministic paging algorithms fall in one of these 
two categories, either they are optimally competitive like LRU, or they are not 
competitive like LFU. An open question posed by Borodin and El-Yaniv [Q 
(open question 3.2 on page 43) is the existence of a natural deterministic paging 
algorithm with a finite but not optimal competitive ratio. It seems that most 
conceivable hybrids of a non-competitive algorithm with an optimally competitive 
algorithm (such as partitioning the cache between the two policies) are not 
competitive. 

Our main contribution here is a tight competitive analysis of LRFU$_\lambda$. Our 
analysis reveals that as $\lambda$ increases we obtain all integral values between 
the competitive ratio of LRU ($k$) and that of LFU ($\infty$). This solves the open 
problem posed by Borodin and El-Yaniv. The full spectrum of values also suggests 
that LRFU$_\lambda$ is the “right” hybrid of LRU and LFU. In this sense the 
competitive analysis supports and complements the experimental results. 
2 The LRFU$_\lambda$ Paging Algorithm 



The LRFU$_\lambda$ paging policy, for $0 \le \lambda \le 1$, is defined as follows: 

1. Each page $p$ currently in the cache has a value $v(p)$ associated with it. 

2. Upon a request for a page $q$, the following operations are performed: 

a) If a page has to be evicted to make room for $q$, then the page with the 
smallest value is evicted. (Ties are broken arbitrarily.) The new page $q$ 
is temporarily given the value $v(q) \leftarrow 0$. 

b) The values of the pages in the cache are updated as follows: 

\[ v(p) \leftarrow \begin{cases} \lambda v(p) & \text{if } p \ne q, \\ 1 + \lambda v(p) & \text{if } p = q. \end{cases} \]



It is easy to see that if a page $p$ is in the cache at time $t$ and if the hits to 
this page, since it last entered the cache, were at times 
$t - j_1, t - j_2, \ldots, t - j_n$, then its value at time $t$ is 
$v(p) = \sum_{i=1}^{n} \lambda^{j_i}$. Note that, for $\lambda < 1$, the value $v(p)$ 
of a page is always smaller than $\sum_{i=0}^{\infty} \lambda^{i} = \frac{1}{1-\lambda}$. 

In particular, we get that if $0 \le \lambda \le \frac{1}{2}$ then LRFU$_\lambda$ 
behaves just like LRU. And, if $\lambda = 1$, then LRFU$_\lambda$ behaves just like 
LFU. It is well known that LRU has an optimal competitive ratio of $k$, while LFU 
is not competitive. We show that as $\lambda$ increases from $\frac{1}{2}$ to 1, the 
competitive ratio of LRFU$_\lambda$ increases from $k$ to $\infty$ and assumes all 
possible integral values in this range. 
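
The policy defined in this section can be sketched directly in Python. This is a minimal illustration, not the authors' implementation: the class name `LRFUCache`, the dictionary of values, and the tie-breaking behaviour of `min` (first minimal entry) are assumptions made for the example.

```python
class LRFUCache:
    """Sketch of the LRFU policy with decay parameter lam and cache size k."""

    def __init__(self, k, lam):
        self.k = k
        self.lam = lam
        self.value = {}      # cached page -> current value v(page)
        self.misses = 0

    def request(self, q):
        """Serve a request for page q; return True if it was a miss."""
        miss = q not in self.value
        if miss:
            self.misses += 1
            if len(self.value) >= self.k:
                # step (a): evict a page with the smallest value
                victim = min(self.value, key=self.value.get)
                del self.value[victim]
            self.value[q] = 0.0          # temporary value of the new page
        # step (b): decay every value, then add 1 to the requested page
        for p in self.value:
            self.value[p] *= self.lam
        self.value[q] += 1.0
        return miss


lru_like = LRFUCache(k=2, lam=0.5)
lfu_like = LRFUCache(k=2, lam=1.0)
for page in "abacb":
    lru_like.request(page)
    lfu_like.request(page)
print(sorted(lru_like.value), sorted(lfu_like.value))
```

On the short sequence a, b, a, c, b the two extremes already diverge: with the decay parameter 1/2 the cache ends up holding the two most recently requested pages, while with decay parameter 1 the twice-requested page a is retained.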



3 Competitiveness of LRFUa 



We obtain the following result: 

Theorem 1. The competitive ratio of LRFU$_\lambda$ for a cache of size $k$, where 
$\frac{1}{2} \le \lambda < 1$, is 

\[ k + \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1. \]

Another way of stating this result is as follows: the competitive ratio of 
LRFU$_\lambda$ is $k + \ell$, where 
$\ell = \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1$. Note that 
$\frac{\lambda^{\ell}}{1-\lambda} > 1$ while $\frac{\lambda^{\ell+1}}{1-\lambda} \le 1$. 
Thus, $\ell$ has the following property: if $p$ was not referenced in the last 
$\ell + 1$ time units, then $v(p) < 1$, and $\ell$ is the smallest number with this 
property. Theorem 1 follows from Lemmas 1 and 3 that we obtain next. 
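
To get a feel for how the additive term grows with the decay parameter, the following sketch computes it in two ways: from the closed form $\lceil \log(1-\lambda)/\log\lambda \rceil - 1$ and as the smallest non-negative integer $\ell$ with $\lambda^{\ell+1} \le 1-\lambda$. The function names are hypothetical, and floating-point arithmetic is assumed to be accurate enough for the sample values used.

```python
import math


def ell_closed_form(lam):
    """ceil(log(1 - lam) / log(lam)) - 1, for 1/2 <= lam < 1."""
    return math.ceil(math.log(1 - lam) / math.log(lam)) - 1


def ell_by_search(lam):
    """Smallest integer l >= 0 such that lam**(l + 1) <= 1 - lam."""
    if not 0.5 <= lam < 1:
        raise ValueError("expected 1/2 <= lam < 1")
    l = 0
    while lam ** (l + 1) > 1 - lam:
        l += 1
    return l


for lam in (0.5, 0.75, 0.9):
    print(lam, ell_closed_form(lam), ell_by_search(lam))
```

For a decay parameter of 1/2 the term is 0, so the ratio is $k$ as for LRU; at 0.75 it is 4 and at 0.9 it is 21, illustrating how the ratio grows without bound as the parameter approaches 1.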



3.1 Upper Bound on the Competitive Ratio 

Lemma 1. For every $\frac{1}{2} \le \lambda < 1$, LRFU$_\lambda$ is 
$(k + \ell)$-competitive, where 
$\ell = \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1$. 
[Figure: a $k$-phase split into its first $\ell$ requests ($\le \ell$ misses) and the remaining requests ($\le k$ misses).] 

Fig. 1. LRFU$_\lambda$ incurs at most $k + \ell$ misses per $k$-phase. 



Proof. As in the classical analysis of LRU (see Chapter 3), we break the 
sequence of requests into $k$-phases. A $k$-phase is a maximal contiguous block of 
requests in which at most $k$ different pages are requested. The first $k$-phase 
starts at the beginning of the sequence and ends just before the $(k+1)$-st 
distinct page is referenced. The second phase begins immediately after the first 
phase, and so on. It is easy to see that any caching policy must incur, on 
average, at least one miss for each $k$-phase. (More precisely, any caching policy 
must incur at least one miss on each shifted $k$-phase, where a shifted $k$-phase is 
a sequence of requests that begins with the second request of an ordinary 
$k$-phase and ends with the first request of the next ordinary $k$-phase.) To show 
that LRFU$_\lambda$ is $(k + \ell)$-competitive, we show that LRFU$_\lambda$ incurs at 
most $k + \ell$ misses on each $k$-phase. 

We first claim that under the LRFU$_\lambda$ policy, a page $p$ that enters the cache 
after the first $\ell$ requests of a $k$-phase stays in the cache until the end of 
the phase. This follows from the interpretation of $\ell$ given after the statement 
of Theorem 1. When $p$ enters the cache and is assigned the value 1, the value 
$v(q)$ of a page $q$ that is currently in the cache, but was not referenced yet in 
the current $k$-phase, must satisfy $v(q) < 1$. If a miss occurs in the $k$-phase 
after $p$ enters the cache, then there is at least one page $q$ in the cache that 
was not referenced yet in the $k$-phase. (If all $k$ pages in the cache were 
referenced in the current phase then a subsequent miss would end the phase.) As 
$v(q) < v(p)$, $p$ will not be evicted. 

It follows, therefore, that at most $\ell$ misses occur at the first $\ell$ requests 
of each $k$-phase (at most one miss per request), and at most $k$ misses occur at 
the rest of the $k$-phase (at most one miss for each page that enters the cache 
after the $\ell$-th request). (See Figure 1.) This completes the proof of the lemma. 

Note that if $\lambda = \frac{1}{2}$ then $\ell = 0$ (i.e., $v(p) \ge 1$ if and only if 
$p$ is the last page referenced), and Lemma 1 states that the competitive ratio of 
LRFU$_{1/2}$ = LRU is $k$. Lemma 1 is therefore an extension of the classical 
competitive analysis of the LRU policy. An easy extension of Lemma 1 is the 
following: 



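Lemma 2. For every $\frac{1}{2} \le \lambda < 1$ and $1 \le h \le k$, LRFU$_\lambda$ 
with a cache of size $k$ is $(k + \ell)/(k - h + 1)$-competitive, where 
$\ell = \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1$, versus an 
offline adversary that uses a cache of size $h$. 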



Proof. Follows from the fact that any algorithm that uses a cache of size $h$ must 
incur at least $k - h + 1$ misses on each shifted $k$-phase. 



152 E. Cohen, H. Kaplan, and U. Zwick 



3.2 Lower Bound on the Competitive Ratio 



We now show that the bound given in Lemma 1 is tight. 



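Lemma 3. For every $\frac{1}{2} \le \lambda < 1$, LRFU$_\lambda$ is at most 
$(k + \ell)$-competitive (that is, its competitive ratio is at least $k + \ell$), 
where $\ell = \left\lceil \frac{\log(1-\lambda)}{\log \lambda} \right\rceil - 1$. 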



Proof. For every $k \ge 1$ and $\frac{1}{2} \le \lambda < 1$, we construct an infinite 
sequence of requests such that the ratio between the number of misses incurred by 
LRFU$_\lambda$ and the minimum possible number of misses needed to serve the 
sequence approaches $k + \ell$. As in the classical case of LRU, only $k + 1$ 
distinct pages are needed for this purpose. 

Our sequences are again partitioned into $k$-phases. The state of the cache at 
the end of each $k$-phase is isomorphic to the state of the cache at the beginning 
of the phase. (This is explained below in more detail.) We show that LRFU$_\lambda$ 
incurs $k + \ell$ misses in each phase. As there is a total of only $k + 1$ pages, 
the sequence of requests can be served with only one miss per $k$-phase. (At the 
first miss of a phase, the optimal algorithm evicts the page that is not going to 
be referenced in that phase.) The claim of the lemma would thus follow. 

Let $L$ (for Large) be a sufficiently large number such that after $L$ consecutive 
requests to a page $p$ we have $v(p) > \lambda^{-\ell}$ and $v(q) < 1$ for any page 
$q$ such that $q \ne p$. Such a number exists as by the definition of $\ell$ we have 
$\lambda^{-\ell} < \frac{1}{1-\lambda}$. 

The first initializing $k$-phase is composed of the following page requests: 

\[ p_2^{L}, p_3^{L}, p_4^{L}, \ldots, p_{k+1}^{L}, \]

where $p_i^{L}$ stands for $L$ consecutive requests of the page $p_i$. At the end of 
this phase, the pages in the cache, in decreasing order of their values, are 
$p_{k+1}, p_k, \ldots, p_2$, and their values satisfy $v(p_{k+1}) > \lambda^{-\ell}$ 
and $v(p_i) < 1$, for $2 \le i \le k$. 

The second phase is now composed of the following page requests: 

\[ p_{(1)}, p_{(2)}, \ldots, p_{(k+\ell)}^{L}, \]

where the parentheses around the indices indicate that they should be interpreted 
modulo $k$ (i.e., $p_{(k+i)} = p_i$, and more generally 
$p_{(i)} = p_{1 + (i-1) \bmod k}$). 

What happens to the cache when this sequence of requests is served? As at 
the beginning of the phase 
$\lambda^{-\ell} < v(p_{k+1}) < \frac{1}{1-\lambda} \le \lambda^{-(\ell+1)}$ (the last 
inequality follows from the definition of $\ell$), we get that during the first 
$\ell$ requests of the phase, page $p_{k+1}$ still has the largest value among all 
the pages of the cache, and the page entering the cache becomes the page with the 
second largest value. The page evicted is always the next page requested, and thus 
each request causes a miss. 

When the $(\ell + 1)$-st request of the phase is served, the value $v(p_{k+1})$ drops 
below 1. The page entering the cache now becomes the page with the largest 
value. The rank of $p_{k+1}$ in the cache decreases by 1 at each step. The page 
evicted is again the next page requested and thus each request still causes a 
miss. While serving the request for $p_{(k+\ell)}$, page $p_{k+1}$ is finally evicted 
from the cache. 

Fig. 2. A typical phase of the worst-case sequence for LRFU$_\lambda$ 

Overall, there are $k + \ell$ misses in this phase. The content of the cache during 
such a phase is depicted in Figure 2. (In the figure, it is assumed, for 
simplicity, that $\ell < k$. The argument given above does not rely on this 
assumption.) 

Now, the state of the cache at the end of the second phase is similar to its state 
at the end of the first phase. One page, namely $p_{(k+\ell)}$, has value larger 
than $\lambda^{-\ell}$, while all other pages have values that are less than 1. We 
can therefore use a sequence of requests similar to the one used in the second 
phase, and again create a $k$-phase in which LRFU$_\lambda$ incurs $k + \ell$ misses. 
This process can be repeated over and over again, resulting in the promised 
infinite sequence of requests. 
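
The lower-bound construction can be checked on small instances with a short simulation. The sketch below is an illustration under stated assumptions, not the paper's exact sequence: instead of writing out the indices modulo $k$, it requests in each step whichever of the $k + 1$ pages is currently missing from the cache, and it then asserts that the heavy page is never the one requested, so that each phase touches only $k$ distinct pages. Exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding ties; the names `worst_case_misses`, `ell`, and the default `L=60` are hypothetical choices.

```python
from fractions import Fraction


def ell(lam):
    """Smallest integer l >= 0 with lam**(l + 1) <= 1 - lam."""
    l = 0
    while lam ** (l + 1) > 1 - lam:
        l += 1
    return l


def worst_case_misses(k, lam, phases=3, L=60):
    """Misses of LRFU in each phase that follows the initializing phase."""
    l = ell(lam)
    pages = list(range(1, k + 2))        # k + 1 distinct pages
    value = {}                           # cached page -> value

    def request(q):
        miss = q not in value
        if miss:
            if len(value) >= k:
                del value[min(value, key=value.get)]
            value[q] = Fraction(0)
        for p in value:
            value[p] *= lam
        value[q] += 1
        return miss

    # initializing phase: L requests to each of p_2, ..., p_{k+1}
    for p in pages[1:]:
        for _ in range(L):
            request(p)

    result = []
    for _ in range(phases):
        heavy = max(value, key=value.get)    # the page with value > lam**(-l)
        misses = 0
        for _ in range(k + l):
            missing = next(p for p in pages if p not in value)
            assert missing != heavy          # heavy page survives until the end
            misses += request(missing)
        for _ in range(L - 1):               # make the last page the new heavy one
            request(missing)
        result.append(misses)
    return result


print(worst_case_misses(2, Fraction(3, 4)))
```

With a cache of size 2 and decay parameter 3/4 (so that the additive term is 4), each phase in this sketch costs LRFU 6 misses, whereas an offline algorithm that evicts the page not requested in the phase pays one miss per phase.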



4 Open Problems 



Web objects may vary significantly in both size and cost of a miss. This moti- 
vated the development of cache replacement algorithms that account for varying 
page sizes and fetching costs [SIBEE)- Experiments showed that the optimally- 
competitive Landlord algorithm performs well on Web caching sequences 1X131 . 
Some experiments, however, show that it can still be outperformed by perfect 
LFU E|. Thus, it would be interesting to extend our results to this more general 
situation and obtain natural hybrid policies. 
Abstract. Admission control (call control) is a well-studied online prob- 
lem. We are given a fixed graph with edge capacities, and must process 
a sequence of calls that arrive over time, accepting some and rejecting 
others in order to stay within capacity limitations of the network. In the 
standard theoretical formulation, this problem is analyzed as a benefit 
problem: the goal is to devise an online algorithm that accepts at least 
a reasonable fraction of the maximum number of calls that could possi- 
bly have been accepted in hindsight. This formulation, however, has the 
property that even algorithms with optimal competitive ratios (typically 
O(logn)) may end up rejecting the vast majority of calls even when it 
would have been possible in hindsight to reject only very few. 

In this paper, we instead consider the goal of approximately minimizing 
the number of calls rejected. This is much more natural for real-world 
settings in which rejections are intended to be rare events. In order to 
avoid trivial lower-bounds, we assume preemption is allowed and that 
calls are given to the algorithm as fixed paths. We show that in a number 
of cases, we can in fact achieve a competitive ratio of 2 for rejections (so 
if the optimal in hindsight rejects 0 then we reject 0; if the optimal rejects 
r then we reject at most 2r). For other cases we get worse but nontrivial 
bounds. For the most general case of fixed paths in arbitrary graphs 
with arbitrary edge capacities, we achieve matching $\Theta(\sqrt{m})$ upper and 
lower bounds. We also show a connection between these problems and 
online versions of the vertex-cover and set-cover problems (our factor-2 
results give 2-approximations to slight generalizations of the vertex cover 
problem, much as prior work shows hardness results for the benefit version 
based on the hardness of approximability of independent set). 



1 Introduction 

In the well-studied admission control (or call control) problem, our job is to 
manage a network G (a graph with edge capacities) in the presence of online 
requests for communication (calls) . Requests for communication may be accepted 
or rejected, and the goal of an online algorithm is to accept as many as possible 
while staying within the edge capacities of the network. 
This problem has typically been studied as a benefit problem. That is, one 
compares the number of calls that could have been accepted in hindsight to the 
number actually accepted by the online algorithm, and tries to minimize this 
ratio. A number of papers have produced good bounds for this metric, such as 
the work of Awerbuch, Azar and Plotkin [AAP93] for the high-capacity setting, 
and Awerbuch et al. for trees and other specific networks. A serious 
problem with viewing call-control as a benefit problem, however, is that even 
with, say, an $O(\log n)$ competitive ratio that would normally be considered quite 
good, it is possible that the algorithm may route only a $1/(\log n)$ fraction of 
the calls even if a solution routing nearly all of them is possible.¹ For many of the 
natural applications of admission control, even a modest constant fraction of 
rejections would be deemed unacceptable performance. Thus, for these types of 
applications, the benefit formulation appears fundamentally flawed. 

In this paper we depart from the benefit metric and instead set our sights 
on the goal of minimizing the number of calls rejected. That is, if OPT (the 
optimal strategy in hindsight) rejects 0 then we should reject 0. If OPT rejects 
a small number, then we should reject only a small multiple of that. What 
we show is that for several natural cases, we can in fact achieve a competitive 
ratio of 2 for rejections. For other versions we can achieve worse but still non- 
trivial bounds. Of course, approximately minimizing rejections suffers from the 
reverse problem that the algorithm may accept no calls even if in hindsight it 
was possible to accept, say, half of them. However, in many applications, even 
optimal performance in such a case would be unacceptable: if one’s network 
required one to reject a significant fraction of calls, then the correct response 
would be to upgrade the network. It is these types of settings that motivate our 
work. 

We assume in our results that the online algorithm is allowed preemption: at 
any time we may preempt (reject) requests that had previously been accepted, 
and simply count it as if the request had been rejected from the start. This is 
one of the standard models and is necessary to achieve any nontrivial bound on 
rejections. A second assumption we make is that each request is for a fixed path. 
That is, the requests can be thought of as a sequence of paths $p_1, p_2, \ldots$, 
and the decision made by the online algorithm is just whether to accept or reject 
each path, and does not involve routing. Again, if routing is part of the 
algorithm’s job, then even in very simple settings, no nontrivial bound is 
possible for our performance metric.² Our results are then as follows: 

¹ In fact, prior to the work of Leonardi et al. the situation was even worse. 
Depending on the types of requests made, many of the randomized algorithms would, 
with probability $1 - 1/(\log n)$, accept no calls at all. That is, the variance of 
possible benefits was high compared to the expectation. 

² Consider a 4-cycle ABCD with capacity c on each edge. Imagine that we are given c 
calls connecting the diagonally opposite nodes A and C, and then we are given either 
c calls connecting A and B, or else c calls connecting A and D, with equal probability. 
Every on-line algorithm rejects c/2 calls in expectation, while it was possible to reject 
none off-line. Similarly, with n separate 4-cycles, the on-line algorithm rejects nc/2 
calls in expectation, while OPT rejects none. 
Admission control on a line: When the underlying graph is a line, we can 
achieve a competitive ratio of 2 for any set of edge capacities. That is, the 
algorithm will reject at most twice as many as the minimum possible in 
hindsight. 

Admission control on a general graph: For general graphs, we can achieve 
a competitive ratio of 2 if all edge capacities are 1 (the disjoint paths case). 
This extends to a ratio of $c + 1$ if all edge capacities are $\le c$. For arbitrary 
capacities, we give a different algorithm that achieves a competitive ratio of 
$O(\sqrt{m})$, where $m$ is the number of edges, which we match with an 
$\Omega(\sqrt{m})$ lower bound. 

An interesting aspect of the rejection measure is that the easiest cases are when 
capacities are low. This is the opposite of the situation for the benefit measure, 
where low capacities are difficult and higher capacities make the problems easier. 

1.1 Related Work 

As discussed above, existing work on admission control has primarily focused on 
the problem of maximizing the number of accepted calls, rather than minimizing 
the number of rejected calls. (See the surveys by Plotkin and Leonardi.) 
The one exception we are aware of is the work of Kamath, Palmon, and 
Plotkin [KPP96], who provide performance guarantees in terms of a competitive 
ratio on rejections. Their setting is quite different from ours, however. Most 
significantly, they assume the input to be probabilistically generated, not worst- 
case. They consider calls that are generated according to a Poisson process, 
with exponentially distributed holding times, and also assume the maximum 
bandwidth of a call to be very small relative to the available edge capacity (i.e. 
large capacities). Moreover, their model does not allow pre-emption. 

Routing on fixed paths, as we consider here, was studied by Alon, Arad, and 
Azar [AAA99] under the traditional measure of maximizing benefit. Of course, in 
linear networks, and more generally in tree networks, one is necessarily routing 
on fixed paths; for trees, work of Awerbuch et al. [ABFR94] provides 
$O(\log n)$-competitive algorithms for maximizing benefit. (See also the earlier work of 
ll0092lOOK^ for linear networks, and the improved probabilistic guarantees 
obtained by Leonard! et al. EMSEEH-) 

A number of previous papers have considered the performance gains obtainable 
by allowing pre-emption, in the context of maximizing benefit. Adler and 
Azar show that allowing pre-emption leads to an $O(1)$-competitive algorithm 
for benefit maximization in linear networks, when the benefit of a call is 
defined to be proportional to the bandwidth it consumes. 

1.2 Notation and Definitions 

We are given a graph $G$, which may be directed or undirected, with $m$ edges 
and $n$ nodes. Each edge $e$ has an integer capacity $c_e > 0$. We are also given a 
sequence of requests, $p_1, p_2, \ldots$, each of which is a simple path in the graph. Each 



158 A. Blum, A. Kalai, and J. Kleinberg 



path may either be accepted or rejected. The requirement is that for every edge 
$e$, the number of unrejected requests that have edge $e$ should be no larger than 
$c_e$. We will call a set of rejection decisions valid if it satisfies this requirement. 

In the off-line problem, we must simply find a small valid set of rejections. In 
the on-line problem, we are given requests one at a time, and we must choose 
to accept or reject the requests on-line so that the set of accepted requests 
never exceeds the capacity of any edge. We also allow our online algorithm to 
preempt an earlier request, i.e. we may reject a request after already accepting 
it. However, we may not accept a request after rejecting it. 

Let OPT be a minimum valid set of rejections. We say that an algorithm is 
$k$-competitive ($k$ may be a function of $m$, $n$, and $c$) if the number of requests 
rejected by this algorithm is at most $k|\mathrm{OPT}|$. 

One final note: Our algorithms will sometimes decide to reject some requests 
even when not strictly necessary. Because we have preemption, these can always 
be implemented in a lazy manner. That is, such requests are marked but not 
actually rejected until a new request arrives that causes a conflict with it. 



2 Preliminaries: Set-Cover and Vertex-Cover 

A well-known result for the set-cover problem is that if every point is in at 
most $k$ sets, then there is a simple $k$-approximation algorithm: pick an arbitrary 
uncovered point, take all $\le k$ sets that cover it, and repeat. The case $k = 2$ 
corresponds to vertex cover. 
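
The simple approximation algorithm just described can be sketched as follows. The function name `cover_by_frequency` and the dictionary-of-sets input format are assumptions made for the illustration; points are scanned in the given order, which plays the role of the arbitrary choice.

```python
def cover_by_frequency(points, sets):
    """Pick an uncovered point, take every set containing it, repeat.

    `sets` maps a set name to the collection of points it covers.
    If every point lies in at most k sets, the result has at most
    k times as many sets as an optimal cover.
    """
    chosen = []
    covered = set()
    for p in points:                      # arbitrary order of points
        if p in covered:
            continue
        containing = [name for name, members in sets.items() if p in members]
        if not containing:
            raise ValueError(f"point {p!r} is in no set")
        chosen.extend(containing)         # take all (at most k) sets covering p
        for name in containing:
            covered |= set(sets[name])
    return chosen


# Vertex cover on the path a-b-c-d: edges are points, vertices are sets.
edges = ["ab", "bc", "cd"]
vertices = {"a": {"ab"}, "b": {"ab", "bc"}, "c": {"bc", "cd"}, "d": {"cd"}}
print(cover_by_frequency(edges, vertices))
```

On this path the sketch returns all four vertices, while the optimal vertex cover {b, c} has two, matching the factor-2 guarantee for the case where every point lies in two sets.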

A slight generalization of the $k = 2$ case is a setting in which a point may 
potentially be covered by many sets $s_1, s_2, \ldots$, but where we are guaranteed 
that some two of those sets $s_i, s_j$ cover their union. Then one can achieve a 
2-approximation as follows: pick an arbitrary uncovered point $p$, find two sets 
that cover the union of all sets covering $p$, take those two sets and repeat. This 
is a 2-approximation because each time two sets are chosen, they can be charged 
to whatever set $s_p$ in the optimal solution is used to cover $p$. Because the two 
sets chosen by the algorithm contain $s_p$, we are guaranteed that each selection 
of two sets is charged to a unique set in the optimal cover. 

Some of the results below can be viewed as an online version of this algorithm 
and guarantee. 

3 Admission Control on a Line 

We begin with the special case of a line graph. Each edge e has some arbitrary 
capacity $c_e$. A request corresponds to an interval on this line and the capacities 
limit the number of intervals covering any given edge that may be accepted. We 
show a 2-competitive algorithm, based on the set-cover idea above. The idea is 
that whenever a new request cannot be accepted due to capacity constraints, we 
look at (an arbitrary) one of the edges that would go over capacity, and throw 
out the two requests $p_l$ and $p_r$ covering that edge that extend farthest to the 
left and farthest to the right, respectively. (One of these may or may not be the 
current request.) We then accept the current request if we did not throw it out. 
To be more precise: 

1. If a request can be accepted, accept it. 

2. If a request cannot be accepted, then choose an arbitrary edge e that would 
be put over capacity. 

a) Among the unrejected requests that contain e (including the current 
request), let $p_l$ be one that extends furthest to the left. 

b) Among the unrejected requests that contain e (including the current 
request), let $p_r$ be one that extends furthest to the right. 

3. Reject $p_l$ and $p_r$, and accept the current request if it is not one of $\{p_l, p_r\}$. 
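
The three steps above can be sketched as follows. This is an illustrative sketch only: the class name `LineAdmission`, the numbering of edges as 0, 1, ..., and the encoding of a request as an inclusive range of edge indices are assumptions, not notation from the paper.

```python
class LineAdmission:
    """Online admission control on a line with preemption (sketch)."""

    def __init__(self, capacities):
        self.cap = list(capacities)       # cap[e] = capacity of edge e
        self.load = [0] * len(self.cap)   # accepted requests per edge
        self.accepted = {}                # request id -> (left, right)
        self.rejected = []                # ids, in order of rejection
        self._next_id = 0

    def _add(self, rid, interval):
        self.accepted[rid] = interval
        for e in range(interval[0], interval[1] + 1):
            self.load[e] += 1

    def _drop(self, rid):
        left, right = self.accepted.pop(rid)
        for e in range(left, right + 1):
            self.load[e] -= 1

    def request(self, left, right):
        rid = self._next_id
        self._next_id += 1
        full = [e for e in range(left, right + 1)
                if self.load[e] >= self.cap[e]]
        if not full:                      # step 1: it fits, accept it
            self._add(rid, (left, right))
            return rid
        e = full[0]                       # step 2: an over-capacity edge
        cands = {r: iv for r, iv in self.accepted.items()
                 if iv[0] <= e <= iv[1]}
        cands[rid] = (left, right)
        p_l = min(cands, key=lambda r: cands[r][0])   # furthest left
        p_r = max(cands, key=lambda r: cands[r][1])   # furthest right
        for victim in {p_l, p_r}:         # step 3: reject both
            if victim != rid:
                self._drop(victim)
            self.rejected.append(victim)
        if rid not in (p_l, p_r):
            self._add(rid, (left, right))
        return rid
```

Accepting the current request in step 3 is safe because every edge it uses is covered by $p_l$ or $p_r$, so removing them frees at least one unit on each of those edges.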



Theorem 1. The above algorithm is 2-competitive. 

Proof. Consider some optimal valid rejection set OPT. Each time the algorithm 
rejects a pair of requests $\{p_l, p_r\}$, we will modify OPT by adding at most 1 
request to it, in order to maintain an invariant that OPT is a superset of the 
requests rejected by the online algorithm. We do this as follows. Each time 
the online algorithm reaches case 2, we know that OPT must have rejected at 
least one request $p_{opt}$ of those being considered by the online algorithm (i.e., at 
least one of those covering edge e that have not yet been rejected by the online 
algorithm). Therefore, when the online algorithm rejects $p_l$ and $p_r$, we know that 
(viewing paths as sets of edges) $p_l \cup p_r \supseteq p_{opt}$. Therefore, if we put $p_l$ and $p_r$ into 
OPT, and then remove $p_{opt}$ if neither $p_l$ nor $p_r$ had been in OPT already, this 
only adds 1 to the size of OPT, maintains its status as a valid rejection set, and 
maintains our invariant. So, if $OPT_{init}$ is the true offline optimal, $OPT_{final}$ is 
the final OPT set achieved by the above transformation, and t is the number of 
requests rejected by the online algorithm, then $t \le |OPT_{final}| \le |OPT_{init}| + t/2$, 
and therefore $t \le 2|OPT_{init}|$. □ 

Another way of viewing this argument is that each time the algorithm rejects 
two requests $p_l$ and $p_r$, we give OPT a “two-fer”, allowing it to reject those two 
requests for the price of 1. Since OPT must reject some request contained in 
their union, it might as well take the offer. Inductively, at the end of the game, 
OPT has rejected the exact same set as the online algorithm, but at half the 
cost. 

The above algorithm and analysis also apply if the underlying graph is a 
cycle. 

4 General Graphs 

4.1 The Low Capacity Case 

Theorem 2. On a general graph G, if every edge e has capacity $c_e \le c$, then 
there is a simple (c + 1)-competitive algorithm. 

Proof. The algorithm is just an online version of the k-approximation to set 
cover: 

1. If a request can be accepted, accept it. 

2. If a request cannot be accepted, then choose the first edge e that would be 
over capacity. Reject the current request along with the $c_e$ (unrejected) other 
requests that contain e. 

This algorithm rejects sets of requests of size $\le c + 1$ that all share an edge. 
These sets are disjoint. Any valid rejection set must include at least one request 
from each of these sets. Therefore, the algorithm achieves a competitive ratio of 
c + 1. □ 
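
The algorithm in the proof can be sketched as below. The function name, the dictionary of capacities keyed by edge label, and the encoding of a request as a tuple of edge labels are assumptions made for illustration.

```python
def low_capacity_admission(capacity, requests):
    """Online (c + 1)-competitive admission control on a general graph (sketch).

    capacity: dict mapping each edge label to its capacity c_e
    requests: iterable of paths, each a tuple of edge labels, in arrival order
    Returns (accepted, rejected): a dict index -> path, and a list of indices.
    """
    load = {e: 0 for e in capacity}
    accepted = {}
    rejected = []
    for idx, path in enumerate(requests):
        over = [e for e in path if load[e] >= capacity[e]]
        if not over:                      # step 1: the request fits
            accepted[idx] = path
            for e in path:
                load[e] += 1
            continue
        e = over[0]                       # step 2: first over-capacity edge
        victims = [i for i, p in accepted.items() if e in p]
        for i in victims:                 # preempt the c_e requests using e
            for edge in accepted.pop(i):
                load[edge] -= 1
        rejected.extend(victims + [idx])
    return accepted, rejected
```

Because the load on e equals $c_e$ whenever e blocks a request, each batch of rejections has exactly $c_e + 1 \le c + 1$ members.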

Thus, if all edges have capacity 1 (the disjoint paths case) we have a 
2-competitive algorithm. 



4.2 General Capacities 

The above algorithm gets worse as the capacities in the graph become large. 
Can we achieve a bound independent of the capacities for general graphs? The 
connection to set-cover suggests that perhaps we could achieve an $O(\log m)$ 
bound. However, it turns out that the online nature of the problem makes that 
impossible. What we show instead are matching $\Theta(\sqrt{m})$ upper and lower 
bounds. We begin with the lower bound. 

Theorem 3. There is an $\Omega(\sqrt{m})$ lower bound on the competitive ratio of any 
online algorithm for general graphs with arbitrary capacities. This holds for 
randomized algorithms as well. 

Proof. For clarity, we will use a multigraph for the lower bound. The multigraph 
consists of k + 1 vertices {0, 1, ..., k} arranged in a line, with k edges connecting 
each vertex to the next. So the total number of edges is $m = k^2$. Each edge has 
capacity $k^{k-1}$. 

We begin by seeing $k^k$ paths of length k, one for each possible route between 
vertex 0 and vertex k. By design, these will fill all edges exactly to capacity. We 
then see k single-edge paths: the first path is a random edge between vertex 0 
and vertex 1; the second is a random edge between vertex 1 and vertex 2, and 
so on. 

The offline algorithm needs only to reject one path, namely the path among 
the first that happens to match the sequence of k single-edge paths seen at 
the end. However, any online algorithm must reject at least k/2 in expectation. 
That is because if j < k paths have been rejected so far, then the next 
single-edge path seen has at least a (k - j)/k chance of causing its edge to go over 
capacity. Therefore, the competitive ratio of any online algorithm is at least k/2, 
which is $\Omega(\sqrt{m})$. □ 

One can also give the above argument using a standard (non-multi) graph. 
For example, we can have the underlying graph be the complete graph and 
initially see all n! Hamiltonian paths that (by design) fill all edges exactly to 
capacity. We then see n edges one at a time that together make up a random 
Hamiltonian path. The optimal offline algorithm again rejects just one path: 
namely, the Hamiltonian path among the first n! corresponding to the final 
sequence of n edges. However, any online algorithm will need to reject at least 
$\Omega(n)$ paths in expectation. 

We now give a matching $O(\sqrt{m})$ upper bound. Specifically, we present a 
$4\sqrt{m}$-competitive algorithm for an arbitrary multigraph with m edges. The 
algorithm is as follows, starting with zero “chips” on every edge and R = 0. 

1. If a request covers at least $\sqrt{m}$ edges that have chips, then 

a) Reject the request 

b) Remove one chip from each of the request’s edges (that have chips) 

2. If a request cannot be accepted (some edge would be over capacity), then 

a) Reject the request 

b) R=R+1 

c) If R is a multiple of $\sqrt{m}$, then 

i. Add a chip to each edge 

ii. Reapply Step 1 to every accepted request so far. 

3. Else, accept the request. 
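
A sketch of the chip algorithm follows. The class name `ChipAdmission`, the data layout, and in particular the use of $\lceil\sqrt{m}\rceil$ as the threshold when $\sqrt{m}$ is not an integer are assumptions made for this illustration; the text does not specify how to round.

```python
import math


class ChipAdmission:
    """Online admission control with chips on edges (sketch)."""

    def __init__(self, capacity):
        self.cap = dict(capacity)                  # edge -> capacity
        self.load = {e: 0 for e in self.cap}
        self.chips = {e: 0 for e in self.cap}
        # Assumption: round sqrt(m) up to an integer threshold.
        self.s = max(1, math.ceil(math.sqrt(len(self.cap))))
        self.R = 0                                 # number of Step-2 rejections
        self.paths = []                            # request id -> tuple of edges
        self.accepted = set()
        self.step1_rejected = []
        self.step2_rejected = []

    def _step1(self, path):
        """Apply Step 1; return True if the request must be rejected."""
        chipped = [e for e in path if self.chips[e] > 0]
        if len(chipped) < self.s:
            return False
        for e in chipped:                          # 1b: remove one chip per edge
            self.chips[e] -= 1
        return True

    def request(self, path):
        rid = len(self.paths)
        self.paths.append(tuple(path))
        if self._step1(path):                      # Step 1
            self.step1_rejected.append(rid)
        elif any(self.load[e] >= self.cap[e] for e in path):
            self.step2_rejected.append(rid)        # Step 2a
            self.R += 1                            # Step 2b
            if self.R % self.s == 0:               # Step 2c
                for e in self.chips:
                    self.chips[e] += 1
                for old in sorted(self.accepted):  # reapply Step 1 (preemption)
                    if self._step1(self.paths[old]):
                        self.accepted.remove(old)
                        for e in self.paths[old]:
                            self.load[e] -= 1
                        self.step1_rejected.append(old)
        else:                                      # Step 3
            self.accepted.add(rid)
            for e in path:
                self.load[e] += 1
        return rid
```

Note that Step 1 may reject a request that would have fit; as remarked in the introduction, such rejections can be implemented lazily thanks to preemption.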



Theorem 4. The above algorithm is $O(\sqrt{m})$-competitive. 

Analysis. Observe that the number of rejections from Step 1 (line 1a) is no 
more than the number of rejections from Step 2 (line 2a). This is because, after 
R Step-2 rejections, we have placed no more than $R/\sqrt{m}$ chips on each edge 
(at most $R\sqrt{m}$ chips in total), and every Step-1 rejection removes at least 
$\sqrt{m}$ chips. Thus, to show that the algorithm is $4\sqrt{m}$-competitive, we will 
show that $R \le 2|OPT|\sqrt{m}$. From here 
on, when we refer to a rejection, we mean a Step-2 rejection. 

Just before we perform each rejection, we “blame” it on an individual edge 
in one of OPT’s rejections, as follows. Let e be the first of the edges that would 
have gone over capacity had we accepted the request. We blame the rejection on 
the first OPT rejection that has not yet been rejected, has e, and has not yet 
been blamed for e. Not only must there be some such OPT rejection to blame, 
but it must come no later in the request sequence than the current request. To 
see this, say we have had e as a blame edge t times before, and we have rejected 
r of OPT’s rejections that have e. Then, including the current request, we must 
have seen $c_e + r + t + 1$ requests that contain e. OPT must also reject at least 
r + t + 1 requests with e, and we have rejected r of these and blamed t of them 
for e, leaving at least 1 previous OPT rejection to blame. 

Also notice that after |OPT| chips have been removed from an edge, we will 
not blame any more rejections on that edge. This is because the total number 
of requests that have an edge e does not exceed $c_e + |OPT|$. Finally, it suffices 
to show that no OPT rejection is blamed for more than $\sqrt{m}$ rejections after 
$R = |OPT|\sqrt{m}$. At this point, we have placed |OPT| chips on each edge. Fix 
an OPT rejection. Look at the first rejection after $R = |OPT|\sqrt{m}$ blamed on it. 
Since we did not reject the OPT rejection in Step 1, it has fewer than $\sqrt{m}$ edges 
with chips. All of its other edges must have had at least |OPT| chips removed, 
so they will never be blamed again. Thus, we will blame at most $\sqrt{m}$ edges on each 
OPT rejection after $R = |OPT|\sqrt{m}$, which implies $R \le 2|OPT|\sqrt{m}$ and the 
total number of rejections is at most $4|OPT|\sqrt{m}$. 

5 The Offline Case 

It is interesting to consider the offline version of our problem because of its 
connection to set-cover. For the off-line problem, we know the excess of each 
edge, i.e. the number of requests that include that edge minus its capacity. Let $n_e$ be 
the excess of edge e. Our goal is to reject the fewest requests such that each edge 
e is contained in at least $n_e$ rejections. This can be thought of as a generalization 
of the set-cover problem, where each point e has an associated number $n_e$, and 
instead of the usual goal of covering each point at least once, a legal cover must 
cover point e at least $n_e$ times. 

Let us define $N = \sum_e n_e$, that is, N is the total sum of the excesses. Then, 
the usual analysis of greedy set-cover gives us an $O(\log N)$ approximation to 
this generalized problem. In particular, if we imagine placing $n_e$ chips on point 
e, then the greedy set-cover algorithm becomes: take the set that covers the most 
points of those with chips on them, remove one chip from each point covered, 
and repeat. If the optimal solution uses k sets, then at each step, the greedy 
algorithm must remove at least a 1/k fraction of the chips remaining, giving us 
the $O(\log N)$ ratio. 
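
The greedy procedure with chips can be sketched as follows. The function name and the dictionary and list encodings are assumptions for illustration; each set (request) may be taken at most once, matching the rejection setting.

```python
def greedy_multicover(excess, sets):
    """Greedy cover in which point e must be covered excess[e] times (sketch).

    excess: dict mapping each point e to its demand n_e
    sets:   list of sets of points; each set may be chosen at most once
    Returns the indices of the chosen sets, in the order they were taken.
    """
    chips = dict(excess)                  # remaining chips per point
    unused = list(range(len(sets)))
    chosen = []

    def gain(i):
        # Number of points in set i that still carry at least one chip.
        return sum(1 for e in sets[i] if chips.get(e, 0) > 0)

    while any(c > 0 for c in chips.values()):
        best = max(unused, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("demands cannot be met by the given sets")
        for e in sets[best]:              # remove one chip per covered point
            if chips.get(e, 0) > 0:
                chips[e] -= 1
        unused.remove(best)
        chosen.append(best)
    return chosen
```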

A natural question is whether this upper bound can be improved to $O(\log m)$ 
where m is the number of points (edges). This would be strictly better than what 
is achievable for the online problem. It is not clear if the greedy algorithm can 
be used to achieve this, but we can get $O(\log m)$ via randomized rounding as 
follows. 

Formulate the problem as a linear programming problem, where $0 \le f_i \le 1$ 
is the fraction of set $S_i$ to take (the fraction of the ith request to reject). 
Our objective is to minimize $\sum_i f_i$ subject to the chip (capacity) constraints 
$\sum_{i : e \in S_i} f_i \ge n_e$ for every point e. The minimum value of the objective will be no 
larger than k, the value of the optimal integer solution. To round, we choose each 
set independently, with probability of choosing set $S_i$ equal to $\min(1, 5 f_i \log m)$. 

Next, Chernoff bounds imply that a given point e is covered at least $n_e$ times 
with probability $\ge 1 - 1/m^2$. To see this, first note that we do not have to worry 
about those sets that have $5 f_i \log m \ge 1$, because we will select them for certain. 
So ignore these sets. Let $z_e$ be the sum of the $f_i$ for the remaining sets $S_i$ that 
have e, and we can assume $z_e \ge 1$ else we are already done. We expect to select 
$5 z_e \log m$ sets covering e, and the only way in which we could fail is if the number 
we actually select is less than $z_e$. By the multiplicative Chernoff bounds, this 
happens with probability less than 

$1/m^2$ 

for sufficiently large m. Thus with probability at least $1 - m/m^2$ we will have all 
points covered the desired number of times. Furthermore, the expected number 
of sets chosen is $O(k \log m)$, as desired. 
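
The rounding step alone can be sketched as below, assuming a fractional solution f has already been obtained from some LP solver (not shown). The function name is hypothetical, and reading "log" as the natural logarithm is an assumption; the text does not fix the base.

```python
import math
import random


def round_fractional_cover(f, m, rng=None):
    """Independently pick set i with probability min(1, 5 * f[i] * log m).

    f:   fractional LP values, one per set, each in [0, 1]
    m:   number of points (edges); must be at least 2
    rng: optional random.Random instance, for reproducibility
    Returns the indices of the chosen sets.
    """
    rng = rng or random.Random()
    chosen = []
    for i, fi in enumerate(f):
        prob = min(1.0, 5.0 * fi * math.log(m))   # natural log (assumption)
        if rng.random() < prob:
            chosen.append(i)
    return chosen
```

A caller would still need to verify that every point e ends up covered $n_e$ times and repeat the rounding otherwise, since the guarantee only holds with high probability.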

6 Conclusions 

We have shown that in a number of natural cases, we can achieve good, or at least 
nontrivial bounds for minimizing the number of rejections in admission control. 
One open question left by these results is that our $\Omega(\sqrt{m})$ lower bound requires 
an exponential number of requests and exponential size capacities. Perhaps one 
might be able to achieve bounds that are logarithmic in m if we also allow 
logarithmic dependence on the maximum capacity c. 

Abstract. The general secure multi-party computation problem is when 
multiple parties (say, Alice and Bob) each have private data (respectively, 
a and b) and seek to compute some function f(a, b) without revealing 
to each other anything unintended (i.e., anything other than what can 
be inferred from knowing f(a, b)). It is well known that, in theory, the 
general secure multi-party computation problem is solvable using circuit 
evaluation protocols. While this approach is appealing in its generality, 
the communication complexity of the resulting protocols depends on 
the size of the circuit that expresses the functionality to be computed. 
As Goldreich has recently pointed out, using the solutions derived 
from these general results to solve specific problems can be impractical; 
problem-specific solutions should be developed, for efficiency reasons. 
This paper is a first step in this direction for the area of computational 
geometry. We give simple solutions to some specific geometric problems, 
and in doing so we develop some building blocks that we believe will be 
useful in the solution of other geometric and combinatorial problems as 
well. 



1 Introduction 

The growth of the Internet opens up tremendous opportunities for cooperative 
computation, where the answer depends on the private inputs of separate enti- 
ties. These computations could even occur between mutually untrusting entities. 
The problem is trivial if the context allows the conduct of these computations by 
a trusted entity that would know the inputs from all the participants; however if 
the context disallows this then the techniques of secure multi-party computation 
become very relevant and can provide useful solutions. 

In this paper we investigate how various computational geometry problems 
could be solved in a cooperative environment, where two parties need to solve 
a geometric problem based on their joint data, but neither wants to disclose its 
private data to the other party. Some of the problems we solve in this framework 
are: 

Problem 1. (Point-Inclusion) Alice has a point z, and Bob has a polygon P. 
They want to determine whether z is inside P, without revealing to each other 
any more than what can be inferred from that answer. In particular, neither of 
them is allowed to learn such information about the relative position of z and P 
as whether z is at the northwest side of P, or whether z is close to one of the 
borders of P, etc. 



Problem 2. (Intersection) Alice has a polygon A, and Bob has a polygon B; they 
both want to determine whether A and B intersect (not where the intersection 
occurs). 



Problem 3. (Closest Pair) Alice has M points in the plane, Bob has N points in 
the plane. Alice and Bob want to jointly find two points among these M + N 
points, such that their mutual distance is smallest. 



Problem 4. (Convex Hulls) Alice has M points in the plane, Bob has N points 
in the plane. Alice and Bob want to jointly find the convex hulls for these M + N 
points; however, neither Alice nor Bob wants to disclose any more information 
to the other party than what could be derived from the result. 

Of course all of the above problems, as well as other computational geometry 
problems, are special cases of the general Secure Multi-party Computation 
problem. Generally speaking, a secure multi-party computation problem 
deals with computing a function on any input, in a distributed network where 
each participant holds one of the inputs, ensuring that no more information is 
revealed to a participant in the computation than can be computed from that 
participant’s input and output. 

In theory, the general secure multi-party computation problem is solvable 
using a circuit evaluation protocol. While this approach is appealing in its 
generality, the communication complexity of the protocol it generates depends 
on the size of the circuit that expresses the functionality F to be computed, and 
in addition involves large constant factors. Therefore, as 
Goldreich points out, using the solutions derived from these general results 
for special cases of multi-party computation can be impractical; special 
solutions should be developed for special cases for efficiency reasons. This is a good 
motivation for seeking special solutions to computational geometry problems, 
solutions that are more efficient than the general theoretical solutions. 

Due to page limitations, we include detailed solutions to only two of the above 
problems: the point-inclusion problem and the intersection problem. Our work assumes 
that all parties are semi-honest; informally speaking, a semi-honest party is one 
who follows the protocol properly with the exception that it keeps a record of 
all its intermediate computations and might try to derive other parties’ private 
inputs from the record. We also assume that adding a random number to a value x 
effectively hides x. The assumption is known to be true in a finite field; in the 
infinite case, our protocols can be considered heuristic or approximate. 

1.1 Related Work 

The history of the multi-party computation problem is extensive since it was 
introduced by Yao and extended by Goldreich, Micali, and Wigderson, 
and by many others. These works use a similar methodology: each functionality 
F is represented as a Boolean circuit, and then the parties run a protocol for 
every gate in the circuit. While this approach is appealing in its generality and 
simplicity, the protocols it generates depend on the size of the circuit. This 
size depends on the size of the input and on the complexity of expressing F as a 
circuit. If the functionality F is complicated, using the circuit evaluation protocol 
will typically not be practical. However, if F is a simple enough functionality, 
using a circuit evaluation protocol can be practical. 

The existing protocols listed below serve as important building blocks in our 
solutions. Our paper [5] contains some primitives for general scientific problems 
that could be used as subroutines by some of our computations (as special cases); 
however, the next section will give better solutions for the special cases that we 
need than the general ones given in [5] (more on this later). 

The Circuit Evaluation Protocol. In a circuit evaluation protocol, each 
functionality is represented by a Boolean circuit, and the construction takes this 
Boolean circuit and produces a protocol for evaluating it. The protocol scans 
the circuit from the input wires to the output wires, processing a single gate in 
each basic step. When entering each basic step, the parties hold shares of the 
values of the input wires, and when the step is completed they hold shares of 
the output wire. 

1-out-of-N Oblivious Transfer. Goldreich’s circuit evaluation protocol uses 
the 1-out-of-N Oblivious Transfer, and our protocols in this paper also depend 
heavily on this protocol. A 1-out-of-N Oblivious Transfer protocol refers 
to a protocol where at the beginning of the protocol one party, Bob, has N inputs 
$X_1, \ldots, X_N$ and at the end of the protocol the other party, Alice, learns one of 
the inputs $X_i$ for some $1 \le i \le N$ of her choice, without learning anything 
about the other inputs and without allowing Bob to learn anything about i. An 
efficient 1-out-of-N Oblivious Transfer protocol was proposed by Naor 
and Pinkas. By combining this protocol with the scheme by Cachin, Micali and 
Stadler, the 1-out-of-N Oblivious Transfer protocol could be achieved with 
polylogarithmic (in n) communication complexity. 

Homomorphic Encryption Schemes 

We need a public-key cryptosystem with a homomorphic property for some of 
our protocols: $E_k(x) * E_k(y) = E_k(x + y)$. Many such systems exist, and examples 
include the systems by Benaloh, Naccache and Stern, Okamoto and 
Uchiyama, and Paillier, to mention a few. A useful property of homomorphic 
encryption schemes is that an “addition” operation can be conducted on 
the encrypted data without decrypting them. 
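
To make the homomorphic property concrete, here is a toy sketch of Paillier-style encryption, one of the systems named above. It uses tiny hard-coded primes and the common simplification g = n + 1, so it is suitable only for checking the identity $E_k(x) * E_k(y) = E_k(x + y)$ and offers no security; the function names are assumptions for illustration. It relies on three-argument `pow` with a negative exponent, which requires Python 3.8 or later.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy key generation with small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                                # inverse of lam mod n
    return n, (lam, mu)                                 # public key, private key


def encrypt(n, message, rng=random):
    """E(message) = (n + 1)^message * r^n mod n^2 for a random unit r."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, ciphertext):
    """Recover the message from a ciphertext using L(u) = (u - 1) / n."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)
```

For example, decrypting `add_encrypted(n, encrypt(n, 20), encrypt(n, 22))` yields the sum of the two plaintexts without either ciphertext being decrypted first.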

Yao’s Millionaire Problem 

This is another protocol used as a primitive in our solutions. The purpose of 
the protocol is to compare two private numbers (i.e., determine which is larger). 
This private comparison problem was first proposed by Yao and is referred 
to as Yao’s Millionaire Problem (because two millionaires wish to know who is 
richer, without revealing any other information about their net worth). The 
early cryptographic solution by Yao has communication complexity that is 
exponential in the number of bits of the numbers involved, using an untrusted 
third party. Cachin proposed a solution based on the $\Phi$-hiding assumption. 
His protocol uses an untrusted third party that can misbehave on its own (for the 
purpose of illegally obtaining information about Alice’s or Bob’s private vectors) 
but does not collude with either participant. The communication complexity of 
Cachin’s scheme is $O(\ell)$, where $\ell$ is the number of bits of each input number. 

2 New Building Blocks 

In this section, we introduce two secure two-party protocols, a scalar product 
protocol, and a vector dominance protocol. Apart from serving as building blocks 
in solving the secure two-party computational geometry problems considered 
later in the paper, these two protocols are of independent interest and will be 
useful in solving other problems as well. 

2.1 Scalar Product Protocol 

Our paper [5] contains a matrix product protocol that could be used for scalar 
product (as a special case), but the scalar product protocol given below is better 
than using the general matrix multiplication protocol. We use $X \cdot Y$ to denote 
the scalar product of two vectors $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$: 
$X \cdot Y = \sum_{k=1}^{n} x_k y_k$. Our definition of the problem is slightly different and more general: We 
assume that Alice has the vector X and Bob has the vector Y, and the goal 
of the protocol is for Alice (but not Bob) to get $X \cdot Y + v$ where v is random 
and known to Bob only (of course without either side revealing to the other the 
private data they start with). Our protocols can easily be modified to work for 
the version of the problem where the random v is given ahead of time as part 
of Bob’s data (the special case v = 0 puts us back in the usual scalar product 
definition). The purpose of Bob’s random v is as follows: If $X \cdot Y$ is a partial 
result that Alice is not supposed to know, then giving her $X \cdot Y + v$ prevents 
Alice from knowing the partial result (even though the scalar product has in fact 
been performed); later, at the end of the multiple-step protocol, the effect of v 
can be effectively “subtracted out” by Bob without revealing v to Alice (this 
should become clearer with example protocols that we later give). 

Problem 5. (Scalar Product Problem) Alice has a vector $X = (x_1, \ldots, x_n)$ and 
Bob has a vector $Y = (y_1, \ldots, y_n)$. Alice (but not Bob) is to get the result of 
$u = X \cdot Y + v$ where v is a random scalar known to Bob only. 

We have developed two protocols, and we will present both of them here. 

Scalar Product Protocol 1. Consider the following naive solution: Alice sends 
p vectors to Bob, only one of which is X (the others are arbitrary). Then Bob 
computes the scalar products between Y and each of these p vectors. At the 
end Alice uses the 1-out-of-N oblivious transfer protocol to get back from Bob 
the product of X and Y. Because of the way the oblivious transfer protocol works, 
Alice can decide which scalar product to get, but Bob cannot learn which one 
Alice has chosen. There are drawbacks to this approach: If the value of X 
has certain publicly known properties, Bob might be able to differentiate X from 
the other p - 1 vectors, and even if Bob is unable to recognize X, his chance of 
guessing it is 1 out of p, which is unacceptably high. 

The above drawbacks can be fixed by dividing vector X into m random 
vectors $V_1, \ldots, V_m$ of which it is the sum, i.e., $X = \sum_{i=1}^{m} V_i$. Alice and Bob can 
use the above naive method to compute $V_i \cdot Y + r_i$, where $r_i$ is a random number 
and $\sum_{i=1}^{m} r_i = v$ (see Figure 1). As a result of the protocol, Alice gets $V_i \cdot Y + r_i$ 
for i = 1, ..., m. Because of the randomness of $V_i$ and its position, Bob cannot 
find out which one is $V_i$. Certainly, there is a 1 out of p possibility that Bob 
can guess the correct $V_i$, but since X is the sum of m such random vectors, the 
chance that Bob guesses the correct X is 1 out of $p^m$, which could be very small if 
we choose $p^m$ large enough. 



[Figure 1: Alice (private input x) splits x = v1 + v2 + v3 + v4 and hides v1, v2, v3, v4 
among random numbers; through 1-out-of-N Oblivious Transfer with Bob (private 
input y) she obtains v1 · y + r1, v2 · y + r2, v3 · y + r3, v4 · y + r4, so that Alice gets 
x · y + v = (v1 · y + r1) + (v2 · y + r2) + (v3 · y + r3) + (v4 · y + r4).] 

Fig. 1. Scalar Product Protocol 1 



After Alice gets $V_i \cdot Y + r_i$ for i = 1, ..., m, she can compute 
$\sum_{i=1}^{m} (V_i \cdot Y + r_i) = X \cdot Y + v$. The detailed protocol is described in the following: 

Protocol 1 (Two-Party Scalar Product Protocol 1) 

Inputs: Alice has a vector $X = (x_1, \ldots, x_n)$, and Bob has a vector $Y = (y_1, \ldots, y_n)$. 

Outputs: Alice (but not Bob) gets $X \cdot Y + v$ where v is a random scalar known 
to Bob only. 

1. Alice and Bob agree on two numbers p and m, such that $p^m$ is so large that 
conducting $p^m$ additions is computationally infeasible. 

2. Alice generates m random vectors, $V_1, \ldots, V_m$, such that $X = \sum_{j=1}^{m} V_j$. 

3. Bob generates m random numbers $r_1, \ldots, r_m$ such that $v = \sum_{j=1}^{m} r_j$. 

4. For each j = 1, ..., m, Alice and Bob conduct the following sub-steps: 

a) Alice generates a secret random number k, $1 \le k \le p$. 

b) Alice sends $(H_1, \ldots, H_p)$ to Bob, where $H_k = V_j$, and the rest of the $H_i$'s 
are random vectors. Because k is a secret number known only to Alice, 
Bob does not know the position of $V_j$. 

c) Bob computes $Z_{j,i} = H_i \cdot Y + r_j$ for i = 1, ..., p. 

d) Using the 1-out-of-N Oblivious Transfer protocol, Alice gets $Z_j = Z_{j,k} = 
V_j \cdot Y + r_j$, while Bob learns nothing about k. 

5. Alice computes $u = \sum_{j=1}^{m} Z_j = X \cdot Y + v$. 

How is privacy achieved: 

— If Bob chooses to guess, his chance of guessing the correct X is 1 out of $p^m$. 

— The purpose of $r_j$ is to add randomness to $V_j \cdot Y$, thus preventing Alice from 
deriving information about Y. 
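
The arithmetic of Protocol 1 can be checked with a small simulation. This is only a sketch under loud assumptions: both parties run in one process, the 1-out-of-N Oblivious Transfer of step 4d is replaced by a plain list lookup (so nothing here is private), random values are drawn from a small integer range, and all function names are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def random_vector(n, rng, bound=1000):
    return [rng.randint(-bound, bound) for _ in range(n)]


def split_vector(vec, m, rng):
    """Return m vectors that sum to vec; the first m - 1 are random."""
    shares = [random_vector(len(vec), rng) for _ in range(m - 1)]
    last = [x - sum(s[i] for s in shares) for i, x in enumerate(vec)]
    return shares + [last]


def scalar_product_protocol_1(X, Y, p=4, m=3, seed=None):
    """Simulate Protocol 1; returns (u, v) with u = X . Y + v."""
    rng = random.Random(seed)
    n = len(X)
    V = split_vector(X, m, rng)                        # step 2 (Alice)
    r = [rng.randint(-1000, 1000) for _ in range(m)]   # step 3 (Bob)
    v = sum(r)
    u = 0
    for j in range(m):                                 # step 4
        k = rng.randrange(p)                           # 4a: Alice's secret index
        H = [random_vector(n, rng) for _ in range(p)]  # 4b: decoys ...
        H[k] = V[j]                                    # ... hiding V_j at slot k
        Z = [dot(h, Y) + r[j] for h in H]              # 4c (Bob)
        u += Z[k]               # 4d: stand-in for oblivious transfer (NOT private)
    return u, v                                        # step 5
```

In a real run Alice would learn only u and Bob only v; the simulation returns both so that the identity $u - v = X \cdot Y$ can be verified.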



Scalar Product Protocol 2. In the following discussion, we define $\pi(X)$ as 
another vector whose elements are a random permutation of those of vector X. 

We begin with two observations. First, a property of the scalar product $X \cdot Y$ 
is that $\pi(X) \cdot \pi(Y) = X \cdot Y$, regardless of what $\pi$ is. Secondly, if Bob sends a 
vector $\pi(V)$ to Alice, where $\pi$ and V are known only to Bob, Alice’s chance of 
guessing the position of any single element of the vector V is 1 out of n (n is the 
size of the vector); Alice’s chance of guessing the positions of all of the elements 
of the vector V is 1 out of n!. 

A naive solution would be to let Alice get both $\pi(X)$ and $\pi(Y)$ but not $\pi$. 
Let us ignore for the time being the drawback that Alice gets the items of Y 
in permuted order, and let us worry about not revealing $\pi$ to Alice: Letting 
Alice know $\pi(X)$ allows her to easily figure out the permutation function $\pi$ from 
knowing both X and $\pi(X)$. In order to avoid this problem, we want to let Alice 
know only $\pi(X + R_b)$ instead of $\pi(X)$, where $R_b$ is a random vector known only 
to Bob. Because of the randomness of $X + R_b$, Alice’s chance of guessing the 
correct $\pi$ is only 1 out of n!. Therefore to get the final scalar product, Bob only 
needs to send $\pi(Y)$ and the result of $R_b \cdot Y$ to Alice, who can compute the result 
of the scalar product by using 

$X \cdot Y = \pi(X + R_b) \cdot \pi(Y) - R_b \cdot Y$ 



Now we turn our attention to the drawback that giving Alice $\pi(Y)$ reveals 
too much about Y (for example, if Alice is only interested in a single element 
of the vector Y, her chance of guessing the right one is 1 out of n, which is 
unacceptably high). One way to fix this is to divide Y into m random pieces, $V_1, \ldots, V_m$, 
with $Y = V_1 + \cdots + V_m$; then Bob generates m random permutations $\pi_1, \ldots, \pi_m$ 
(one for each “piece” $V_i$ of Y) and lets Alice know $\pi_i(V_i)$ and $\pi_i(X + R_i)$ for 
i = 1, ..., m. Now in order to guess the correct value of a single element of Y, 
Alice has to guess the correct position of $V_i$ in each one of the m rounds; the 
probability of a successful guess becomes 1 out of $n^m$. 

Now, let us consider the unanswered question: how could Alice get $\pi(X + R_b)$ 
without learning $\pi$ or $R_b$? We do this with a technique based on a homomorphic 
public key system, which has been used in a different context (to compute 
the minimum value in a vector that is the difference of Alice’s private vector 
and Bob’s private vector). Recall that an encryption scheme is homomorphic if 
$E_k(x) * E_k(y) = E_k(x + y)$. A good property of homomorphic encryption schemes 
is that an “addition” operation can be conducted on the encrypted data without 
decrypting them. Based on the homomorphic public key system, we have the 
following Permutation Protocol (where, for a vector $Z = (z_1, \ldots, z_n)$, we define 
$E(Z) = (E(z_1), \ldots, E(z_n))$, $D(Z) = (D(z_1), \ldots, D(z_n))$). 

Protocol 2 (Permutation Protocol) 

Inputs: Alice has a vector X. Bob has a permutation $\pi$ and a vector R. 
Output: Alice gets $\pi(X + R)$. 

1. Alice generates a key pair for a homomorphic public key system and sends the 
public key to Bob. The corresponding encryption and decryption are denoted 
as $E(\cdot)$ and $D(\cdot)$. 

2. Alice encrypts $X = (x_1, \ldots, x_n)$ using her public key and sends 
$E(X) = (E(x_1), \ldots, E(x_n))$ to Bob. 

3. Bob computes E(R), then computes $E(X) * E(R) = E(X + R)$; Bob then 
permutes $E(X + R)$ using the random permutation function $\pi$, thus getting 
$\pi(E(X + R))$; Bob sends the result of $\pi(E(X + R))$ to Alice. 

4. Alice computes $D(\pi(E(X + R))) = \pi(D(E(X + R))) = \pi(X + R)$. 

Based on the Secure Two-Party Permutation Protocol, we have developed the 
following scalar product protocol: 

Protocol 3 (Secure Two-Party Scalar Product Protocol 2) 

Inputs: Alice has a secret vector X, Bob has a secret vector Y. 

Output: Alice gets $X \cdot Y + v$ where v is a random scalar known to Bob 
only. 

1. Bob’s set up: 

a) Bob divides Y into m random pieces, s.t. $Y = V_1 + \cdots + V_m$. 

b) Bob generates m random vectors $R_1, \ldots, R_m$, and lets $v = \sum_{i=1}^{m} V_i \cdot R_i$. 

c) Bob generates m random permutations $\pi_1, \ldots, \pi_m$. 

2. For each i = 1, ..., m, Alice and Bob do the following: 

a) Using the Secure Two-Party Permutation Protocol, Alice gets $\pi_i(X + R_i)$ 
without learning either $\pi_i$ or $R_i$. 

b) Bob sends $\pi_i(V_i)$ to Alice. 

c) Alice computes $Z_i = \pi_i(V_i) \cdot \pi_i(X + R_i) = V_i \cdot X + V_i \cdot R_i$. 

3. Alice computes $u = \sum_{i=1}^{m} Z_i = \sum_{i=1}^{m} V_i \cdot X + \sum_{i=1}^{m} V_i \cdot R_i = X \cdot Y + v$. 
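
The algebra of Protocol 3 can likewise be checked with a short simulation. This sketch assumes a single process playing both roles and computes $\pi_i(X + R_i)$ directly, where the real protocol would obtain it through the Permutation Protocol under homomorphic encryption; it therefore demonstrates correctness only, not privacy. The function names and the small integer range for random values are hypothetical choices.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scalar_product_protocol_2(X, Y, m=3, seed=None, bound=1000):
    """Simulate Protocol 3; returns (u, v) with u = X . Y + v."""
    rng = random.Random(seed)
    n = len(X)

    # Step 1a: Bob splits Y into m pieces V_1, ..., V_m that sum to Y.
    pieces = [[rng.randint(-bound, bound) for _ in range(n)]
              for _ in range(m - 1)]
    pieces.append([y - sum(p[i] for p in pieces) for i, y in enumerate(Y)])

    # Step 1b: Bob draws random vectors R_i and sets v = sum_i V_i . R_i.
    R = [[rng.randint(-bound, bound) for _ in range(n)] for _ in range(m)]
    v = sum(dot(Vi, Ri) for Vi, Ri in zip(pieces, R))

    u = 0
    for Vi, Ri in zip(pieces, R):
        perm = list(range(n))
        rng.shuffle(perm)                            # step 1c: permutation pi_i
        masked = [X[k] + Ri[k] for k in perm]        # step 2a: pi_i(X + R_i)
        permuted_piece = [Vi[k] for k in perm]       # step 2b: pi_i(V_i)
        u += dot(permuted_piece, masked)             # step 2c: Z_i
    return u, v                                      # step 3
```

Applying the same permutation to both vectors leaves each scalar product unchanged, which is why the $Z_i$ sum to $X \cdot Y + v$.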

How is privacy achieved: 

— The purpose of $R_i$ is to prevent Alice from learning $\pi_i$. 

— The purpose of $\pi_i$ is to prevent Alice from learning $V_i$. Although Alice learns 
a random permutation of the $V_i$, she does not learn more because of the 
randomness of $V_i$. Without $\pi_i$, Alice could learn each single value of $V_i$. 

— If Alice chooses to guess, in order to successfully guess all of the elements in 
Y, her chance is $(\frac{1}{n!})^m$. 

— Alice’s chance of successfully guessing just one element of Y is $(\frac{1}{n})^m$. For 
example, in order to guess the kth element of Y, Alice has to guess the 
corresponding elements in $\pi_i(V_i)$ for all i = 1, ..., m; for each i, the chance 
is $\frac{1}{n}$. 

— A drawback of this protocol is that the information about $\sum_{i=1}^{n} y_i$ is disclosed, 
because the random permutation does not help to hide this information. 



Comparison of These Two Protocols. The communication cost of Protocol 3
is $4m * n$, where $m$ is a security parameter (so that $\mu' = n^m$ is large
enough). The communication cost of Protocol 1 is $p * t * n$, where $p > 2$ and $t$
are security parameters such that $\mu'' = p^t$ is large enough. Setting
$\mu' = \mu'' = \mu$ for the sake of comparison, the communication cost of
Protocol 3 is $\frac{4 \log \mu}{\log n} n$ and the communication cost of
Protocol 1 is $\frac{p \log \mu}{\log p} n$. When $n$ is large, Protocol 3
is more efficient than Protocol 1.
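
The comparison can be made explicit with a short derivation. It is a sketch that relies only on the two cost expressions stated above, $4mn$ for the permutation-based protocol and $ptn$ for the other one, together with the common security level $\mu = n^m = p^t$.

```latex
\begin{align*}
  n^m = \mu \;&\Longrightarrow\; m = \frac{\log \mu}{\log n},
  & 4mn &= \frac{4 \log \mu}{\log n}\, n, \\
  p^t = \mu \;&\Longrightarrow\; t = \frac{\log \mu}{\log p},
  & ptn &= \frac{p \log \mu}{\log p}\, n, \\
  \frac{4mn}{ptn} &= \frac{4 \log p}{p \log n}.
\end{align*}
```

For fixed $p$ the ratio tends to zero as $n$ grows, which is the sense in which the permutation-based protocol becomes the cheaper one for large $n$.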



2.2 Secure Two-Party Vector Dominance Protocol 

Definition 1 (Vector Dominance) Let $A = (a_1, \ldots, a_n)$ and
$B = (b_1, \ldots, b_n)$; if for all $i = 1, \ldots, n$ we have $a_i > b_i$, then
we say that $A$ dominates $B$ and denote it by $A \succ B$.



Problem 6. (Secure Two-Party Vector Dominance Problem) Alice has a vector
$A = (a_1, \ldots, a_n)$ and Bob has a vector $B = (b_1, \ldots, b_n)$. Alice wants
to know whether $A$ dominates $B$. Note that in the case where $A$ does not dominate
$B$, neither Alice nor Bob should learn the relative ordering of any individual
$a_i, b_i$ pair (i.e., whether $a_i < b_i$ or not).



The Protocol. We first give an outline of the protocol, then discuss each step
in detail.

Protocol 4 (Secure Two-Party Vector Dominance Protocol)

Inputs: Alice has a vector $A = (a_1, \ldots, a_n)$. Bob has a vector
$B = (b_1, \ldots, b_n)$.

1. Inputs Disguise: Using a disguise technique (described later), Alice gets the
disguised input $A' = (a'_1, \ldots, a'_{4n})$, and Bob gets the disguised input
$B' = (b'_1, \ldots, b'_{4n})$. Let
$V_A = (\underbrace{1, \ldots, 1}_{2n}, \underbrace{0, \ldots, 0}_{2n})$.

2. Private Permutation: Bob generates a random permutation $\pi$ and a random
vector $R$. Using the Permutation Protocol (Protocol 2), Alice gets
$A'' = \pi(A' + R)$. Bob also computes $B'' = \pi(B' + R)$ and $V'_A = \pi(V_A)$.

3. Yao’s Millionaire Comparison: Alice and Bob use Yao’s Millionaire protocol
as a subroutine to compare $A''_i$ with $B''_i$, for $i = 1, \ldots, 4n$, where
$A''_i$ (resp., $B''_i$) is the $i$th element of vector $A''$ (resp., $B''$). At
the end, Alice gets the result $U = \{u_1, \ldots, u_{4n}\}$, where $u_i = 1$ if
$A''_i > B''_i$, otherwise $u_i = 0$.

4. Dominance Testing: Alice and Bob use a private equality-testing protocol to
compare $U$ with $V'_A$. If $U = V'_A$, then $A$ dominates $B$; otherwise, $A$ does
not dominate $B$. (Note: when we later use this protocol for the intersection
protocol, this step must be skipped.)

Outputs: If the Dominance Testing step needs to be skipped, Alice outputs
$U$ and Bob outputs $V'_A$. Otherwise, Alice and Bob each output the dominance
testing results.

Step 1: Inputs Disguise 

For convenience, we assume $a_i$ and $b_i$ for $i = 1, \ldots, n$ are integers;
however, our scheme can be easily extended to the non-integer case. The disguised
inputs are the following:

$A' = (2a_1, \ldots, 2a_n, (2a_1 + 1), \ldots, (2a_n + 1),
-2a_1, \ldots, -2a_n, -(2a_1 + 1), \ldots, -(2a_n + 1))$    (1)

$B' = ((2b_1 + 1), \ldots, (2b_n + 1), 2b_1, \ldots, 2b_n,
-(2b_1 + 1), \ldots, -(2b_n + 1), -2b_1, \ldots, -2b_n)$    (2)

The purpose of the inputs disguise is to get the same number of $a'_i > b'_i$
situations as that of $a'_i < b'_i$ situations; therefore, nobody knows how many
$a_i$’s are larger than $b_i$’s and vice versa. The disguise is based on the fact
that if $a_i > b_i$, then $2a_i > 2b_i + 1$, $(2a_i + 1) > 2b_i$,
$-2a_i < -(2b_i + 1)$, and $-(2a_i + 1) < -2b_i$, which generates two >’s and
two <’s.
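
The balancing effect of the disguise can be checked numerically. The following is a minimal sketch for integer inputs; the function names are hypothetical, and the comparison is done in the clear purely to show the resulting bit patterns.

```python
def disguise_alice(a):
    """A' = (2a, 2a+1, -2a, -(2a+1)), each block taken over all a_i."""
    return ([2 * x for x in a] + [2 * x + 1 for x in a]
            + [-2 * x for x in a] + [-(2 * x + 1) for x in a])


def disguise_bob(b):
    """B' = (2b+1, 2b, -(2b+1), -2b), each block taken over all b_i."""
    return ([2 * x + 1 for x in b] + [2 * x for x in b]
            + [-(2 * x + 1) for x in b] + [-2 * x for x in b])


def comparison_bits(a, b):
    """Bit i is 1 exactly when the i-th disguised entry of A exceeds B's."""
    return [1 if x > y else 0
            for x, y in zip(disguise_alice(a), disguise_bob(b))]


def dominance_pattern(n):
    """The bit vector obtained when every a_i > b_i: 2n ones, then 2n zeros."""
    return [1] * (2 * n) + [0] * (2 * n)
```

Whatever the inputs, each index i contributes exactly two ones and two zeros, so the number of ones is always 2n; only their positions depend on whether dominance holds.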

Step 2: Private Permutation: This step is fully discussed in the Secure
Two-Party Permutation Protocol (Protocol 2).

Step 3: Yao’s Millionaire Comparison

Alice now has $A'' = \pi(A' + R) = (a''_1, \ldots, a''_{4n})$; Bob has
$B'' = \pi(B' + R) = (b''_1, \ldots, b''_{4n})$. They can use Yao’s Millionaire
Protocol to compare each $a''_i$ with $b''_i$. Actually it is a one-sided
(asymmetric) version of it because only Alice learns the result. So at the end of
this step, Alice gets $U = (u_1, \ldots, u_{4n})$, where for $i = 1, \ldots, 4n$,
$u_i = 1$ if $a''_i > b''_i$, otherwise $u_i = 0$.

Step 4: Dominance Testing

Because $V'_A$ is exactly what $U$ should be if vector $A$ dominates $B$, we only
need to find out whether $U = V'_A$. Alice cannot just send $U$ to Bob because it
will allow Bob to find out the relationship between $a_i$ and $b_i$ for each
$i = 1, \ldots, n$. So we need a way for Alice and Bob to determine whether
Alice’s $U$ equals Bob’s $V'_A$ without disclosing each person’s private input to
the other person.

This comparison problem is well studied, and was thoroughly discussed by
Fagin, Naor, and Winkler. Several methods for it have been discussed in the
literature. For example, the following is part of the folklore:

Protocol 5 (Equality-Testing Protocol)

Inputs: Alice has $U$, Bob has $V'_A$.

Outputs: $U = V'_A$ iff $E_B(E_A(U)) = E_A(E_B(V'_A))$.

1. Alice encrypts $U$ with a commutative encryption scheme, and gets $E_A(U)$;
Alice sends $E_A(U)$ to Bob.

2. Bob encrypts $E_A(U)$, and gets $E_B(E_A(U))$; Bob sends the result back to
Alice.

3. Bob encrypts $V'_A$, gets $E_B(V'_A)$; Bob sends $E_B(V'_A)$ to Alice.

4. Alice encrypts $E_B(V'_A)$, gets $E_A(E_B(V'_A))$.

5. Alice compares $E_B(E_A(U))$ with $E_A(E_B(V'_A))$.
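
The message flow of the equality test can be sketched as follows. The text does not fix a commutative encryption scheme, so this illustration assumes exponentiation modulo a prime (in the style of Pohlig-Hellman), where encrypting with keys a and b in either order yields the value raised to the power ab. The prime, the bit-vector encoding and all function names are assumptions made for the example, and vectors are assumed to be shorter than 126 bits.

```python
import math
import random

P = 2**127 - 1  # a Mersenne prime, used here as an illustrative modulus


def keygen():
    """Pick an exponent coprime to P - 1 so that encryption is a bijection."""
    while True:
        k = random.randrange(3, P - 1)
        if math.gcd(k, P - 1) == 1:
            return k


def encode(bits):
    """Map a bit vector to an integer >= 2; the leading 1 keeps leading zeros."""
    return int("1" + "".join(str(b) for b in bits), 2)


def enc(key, value):
    return pow(value, key, P)


def private_equal(u, v):
    """Follow the five steps: True exactly when the two bit vectors are equal."""
    a, b = keygen(), keygen()        # Alice's and Bob's secret keys
    ea_u = enc(a, encode(u))         # step 1: Alice to Bob
    eb_ea_u = enc(b, ea_u)           # step 2: Bob to Alice
    eb_v = enc(b, encode(v))         # step 3: Bob to Alice
    ea_eb_v = enc(a, eb_v)           # step 4: Alice
    return eb_ea_u == ea_eb_v        # step 5: Alice compares
```

Since both keys are invertible modulo P - 1, the doubly encrypted values coincide only when the encoded inputs coincide, so the outcome does not depend on the random keys.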

3 Secure Two-Party Geometric Computations 

In the following, we want to illustrate how the building blocks we studied earlier 
can be put together to solve geometric problems. Many other geometric problems 
are amenable to such solutions; and in fact we suspect that the solutions we give 
below can be further improved. 

3.1 Secure Two-Party Point-Inclusion Problem 

We will look at how the point-location problem is solved in a straightforward
way without worrying about the privacy concern. The computation cost of this
straightforward solution is $O(n)$. Although we know the computation cost of the
best algorithm for the point-location problem is only $O(\log n)$, we are concerned
that the “binary search” nature of that solution might lead to the disclosure of
partial information. Therefore, for a preliminary result, we focus on the $O(n)$
solution. The algorithm works as follows:

1. Find the leftmost vertex $l$ and the rightmost vertex $r$ of the polygon.

2. Decide whether the point $p = (\alpha, \beta)$ is above all the edges on the
lower boundary of the polygon between $l$ and $r$.

3. Decide whether the point $p$ is below all the edges on the upper boundary of
the polygon between $l$ and $r$.

4. If the above two tests are both true, then the point is inside the polygon,
otherwise it is outside (or on the edge) of the polygon.

If we use $f_i(x, y) = 0$ for the equation of the line boundary of the polygon,
where $f_i(x, y) = 0$ for $i = 1, \ldots, m$ represent the edges on the lower part
of the boundary and $f_i(x, y) = 0$ for $i = m + 1, \ldots, n$ represent the edges
on the upper part of the boundary, then our goal is to decide whether
$f_i(\alpha, \beta) > 0$ for all $i = 1, \ldots, m$ and $f_i(\alpha, \beta) < 0$
for all $i = m + 1, \ldots, n$.
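
Without any privacy requirement, this linear-time test can be sketched as below. The sketch assumes a convex polygon with no vertical edges, and it assumes each line function is oriented so that positive values lie above the line; the helper names are hypothetical.

```python
def line_through(p, q):
    """Coefficients (a, b, c) of f(x, y) = a*x + b*y + c, with f > 0 above.

    Assumes p and q have different x-coordinates (no vertical edges).
    """
    (x1, y1), (x2, y2) = sorted([p, q])
    a = -(y2 - y1)
    b = x2 - x1            # positive, so f grows with y
    c = -b * y1 - a * x1   # makes f vanish at (x1, y1)
    return a, b, c


def inside(lower_edges, upper_edges, point):
    """True when the point is strictly above all lower and below all upper edges."""
    x, y = point
    above_lower = all(a * x + b * y + c > 0 for a, b, c in lower_edges)
    below_upper = all(a * x + b * y + c < 0 for a, b, c in upper_edges)
    return above_lower and below_upper
```

For example, a diamond with leftmost vertex (0, 0) and rightmost vertex (4, 0) has two lower edges meeting at (2, -2) and two upper edges meeting at (2, 2); the point (2, 0) passes both tests, while a boundary point such as (4, 0) fails because the inequalities are strict.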

The Protocol. First, we need to find a way to compute $f_i(\alpha, \beta)$ without
disclosing Alice’s $p = (\alpha, \beta)$ to Bob or Bob’s $f_i$ to Alice. Moreover,
no party should learn the result of $f_i(\alpha, \beta)$ for any $i$ because that
could disclose the relationship between the location and the edge. Since
$f_i(\alpha, \beta)$ is a special case of a scalar product, we can use the Secure
Two-Party Scalar Product Protocol to solve this problem. In this protocol, we will
let both parties share the result of $f_i(\alpha, \beta)$, namely, one party will
have $u_i$, the other party will have $v_i$, and $u_i = f_i(\alpha, \beta) + v_i$;
therefore nobody learns the value of $f_i(\alpha, \beta)$, but they can find out
whether $f_i(\alpha, \beta) > 0$ by comparing whether $u_i > v_i$, which could be
done using Yao’s Millionaire Protocol.

However, we cannot use Yao’s Millionaire Protocol for each $(u_i, v_i)$ pair
individually because that would disclose the relationship between $u_i$ and $v_i$,
thus revealing too much information. In fact, all we want to know is whether
$(u_1, \ldots, u_n)$ dominates $(v_1, \ldots, v_n)$. This problem can be solved
using the Vector Dominance Protocol (Protocol 4).

Based on the Scalar Product Protocol and Vector Dominance Protocol, we
have the following Secure Two-Party Point-Inclusion Protocol:

1. Bob generates $n$ random numbers $v_1, \ldots, v_n$.

2. Alice and Bob use the Scalar Product Protocol to compute
$u_i = f_i(\alpha, \beta) + v_i$ for $i = 1, \ldots, m$ and
$u_i = -f_i(\alpha, \beta) + v_i$ for $i = m + 1, \ldots, n$.
According to the scalar product protocol, Alice will get $(u_1, \ldots, u_n)$ and
Bob will get $(v_1, \ldots, v_n)$. Bob will learn nothing about $u_i$ and
$(\alpha, \beta)$; Alice will learn nothing about $v_i$ and the function
$f_i(x, y)$.

3. Alice and Bob use the Vector Dominance Protocol to find out whether vector
$A = (u_1, \ldots, u_n)$ dominates $B = (v_1, \ldots, v_n)$. According to the
Vector Dominance Protocol, if $A$ does not dominate $B$, no other information is
disclosed.

Claim. If $A = (u_1, \ldots, u_n)$ dominates $B = (v_1, \ldots, v_n)$, then the
point $p = (\alpha, \beta)$ is inside the polygon; otherwise, the point is outside
(or on the edge) of the polygon.
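
The claim can be illustrated by simulating the shared values in the clear. In this sketch the Scalar Product and Vector Dominance protocols are replaced by direct computation, so it demonstrates only the arithmetic, not the privacy; the edge coefficients, the range `bound` and the function names are illustrative assumptions.

```python
import random


def share_edge_values(lower_edges, upper_edges, point, bound=10**6):
    """Return (u, v): Alice's masked values and Bob's random masks."""
    x, y = point
    edges = list(lower_edges) + list(upper_edges)
    v = [random.randrange(-bound, bound) for _ in edges]
    u = []
    for k, (a, b, c) in enumerate(edges):
        value = a * x + b * y + c
        # Upper edges are negated so that "inside" always means u_i > v_i.
        sign = 1 if k < len(lower_edges) else -1
        u.append(sign * value + v[k])
    return u, v


def dominates(u, v):
    return all(ui > vi for ui, vi in zip(u, v))
```

Each `u[k] - v[k]` equals the signed edge value, so `dominates(u, v)` holds exactly when every lower-edge value is positive and every upper-edge value is negative.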

3.2 Secure Two-Party Intersection Problem 

Two polygons intersect if (1) one polygon is inside another, or (2) at least one 
edge of a polygon intersects with one edge of another polygon. Since (1) can be 
decided using the Point-Inclusion Protocol, we only focus on (2). 

We will first look at how the intersection problem could be solved in a
straightforward way ($O(n^2)$) without worrying about the privacy concern. For
the same reason as before, we decide not to use the more efficient $O(n)$
algorithm because of the concern about partial information disclosure. The
algorithm works as follows:

1. For each pair $(e_i, e'_j)$, decide whether $e_i$ intersects with $e'_j$, where
$e_i$ is an edge of polygon $A$ and $e'_j$ is an edge of polygon $B$.

2. If there exists an edge $e_i \in A$ and an edge $e'_j \in B$, such that $e_i$
intersects with $e'_j$, then $A$ and $B$ intersect.

We use $f_i(x, y)$, $(x_i, y_i)$, $(x'_i, y'_i)$, for $i = 1, \ldots, n_a$, to
represent each edge of the polygon $A$, where $f_i(x, y)$ is the equation of the
line containing that edge, and $(x_i, y_i)$ and $(x'_i, y'_i)$ represent the two
endpoints of the edge. We use $g_i(x, y)$, for $i = 1, \ldots, n_b$, to represent
each edge of the polygon $B$.

The Protocol. During the testing of whether two edges intersect with each
other, obviously, nobody should learn the result of each individual test;
otherwise, he knows which of his edges intersects with the other party’s polygon.
In our scheme, Alice and Bob conduct these $n^2$ tests, but nobody knows the
result of each individual test; instead, they share the results of each test,
namely each of them gets a seemingly random piece of the result. One has to obtain
both pieces in order to know the result of each test. At the end, all these shared
pieces are put together in a way that only a single result is generated, to show
only whether the two polygon boundaries intersect or not.

First, let us see how to conduct such a secure two-party testing of the
intersection. Assume Alice has an edge $f_1(x, y) = 0$, where
$f_1(x, y) = a_1 x + b_1 y + c_1$, and $a_1 > 0$; the two endpoints of the edge are
$(x_1, y_1)$ and $(x'_1, y'_1)$. Bob has an edge $f_2(x, y) = 0$, where
$f_2(x, y) = a_2 x + b_2 y + c_2$, $a_2 > 0$; the two endpoints of the edge are
$(x_2, y_2)$ and $(x'_2, y'_2)$. According to the geometry, $f_1$ and $f_2$
intersect if and only if $f_1$’s two endpoints $(x_1, y_1)$, $(x'_1, y'_1)$ are on
different sides of $f_2$, and $f_2$’s two endpoints $(x_2, y_2)$ and
$(x'_2, y'_2)$ are on different sides of $f_1$. In other words, $f_1$ and $f_2$
intersect if and only if one of the following expressions is true:

- $f_1(x_2, y_2) > 0 \wedge f_1(x'_2, y'_2) < 0 \wedge f_2(x_1, y_1) > 0 \wedge f_2(x'_1, y'_1) < 0$

- $f_1(x_2, y_2) > 0 \wedge f_1(x'_2, y'_2) < 0 \wedge f_2(x_1, y_1) < 0 \wedge f_2(x'_1, y'_1) > 0$

- $f_1(x_2, y_2) < 0 \wedge f_1(x'_2, y'_2) > 0 \wedge f_2(x_1, y_1) > 0 \wedge f_2(x'_1, y'_1) < 0$

- $f_1(x_2, y_2) < 0 \wedge f_1(x'_2, y'_2) > 0 \wedge f_2(x_1, y_1) < 0 \wedge f_2(x'_1, y'_1) > 0$

We cannot let either party know the results of $f_1(x_2, y_2)$,
$f_1(x'_2, y'_2)$, $f_2(x_1, y_1)$, or $f_2(x'_1, y'_1)$ (in the following
discussion, we will use $f(x, y)$ to represent any of these expressions).
According to the Scalar Product Protocol, we can let Alice know the result of
$f(x, y) + r$, and let Bob know $r$, where $r$ is a random number generated by
Bob. Therefore, nobody knows the actual value of $f(x, y)$, but Alice and Bob can
still figure out whether $f(x, y) > 0$ by comparing $f(x, y) + r$ with $r$.

Let $u_1 = f_1(x_2, y_2) + r_1$, $u'_1 = f_1(x'_2, y'_2) + r'_1$,
$u_2 = f_2(x_1, y_1) + r_2$, and $u'_2 = f_2(x'_1, y'_1) + r'_2$. Alice has
$(u_1, u'_1, u_2, u'_2)$ and Bob has $(r_1, r'_1, r_2, r'_2)$. Then $f_1$ and
$f_2$ intersect if and only if one of the following expressions is true:

- $u_1 > r_1 \wedge u'_1 < r'_1 \wedge u_2 > r_2 \wedge u'_2 < r'_2$

- $u_1 > r_1 \wedge u'_1 < r'_1 \wedge u_2 < r_2 \wedge u'_2 > r'_2$

- $u_1 < r_1 \wedge u'_1 > r'_1 \wedge u_2 > r_2 \wedge u'_2 < r'_2$

- $u_1 < r_1 \wedge u'_1 > r'_1 \wedge u_2 < r_2 \wedge u'_2 > r'_2$
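
The equivalence between the sign tests and the masked comparisons can be checked with a small simulation. The sketch below computes everything in the clear, with hypothetical helper names and integer coordinates; in the actual protocol the masked values would come from the Scalar Product Protocol and the comparisons would never be revealed individually.

```python
import random


def line(p, q):
    """Coefficients (a, b, c) of a line a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return a, b, -(a * x1 + b * y1)


def evaluate(f, point):
    a, b, c = f
    return a * point[0] + b * point[1] + c


def masked_intersect(seg1, seg2, bound=10**6):
    """Decide proper intersection from masked values u and masks r only."""
    f1, f2 = line(*seg1), line(*seg2)
    values = [evaluate(f1, seg2[0]), evaluate(f1, seg2[1]),
              evaluate(f2, seg1[0]), evaluate(f2, seg1[1])]
    r = [random.randrange(-bound, bound) for _ in values]   # Bob's masks
    u = [val + ri for val, ri in zip(values, r)]            # Alice's values
    gt = [ui > ri for ui, ri in zip(u, r)]
    lt = [ui < ri for ui, ri in zip(u, r)]
    # The four expressions: opposite signs within each pair of endpoints.
    expressions = [
        (gt[0], lt[1], gt[2], lt[3]),
        (gt[0], lt[1], lt[2], gt[3]),
        (lt[0], gt[1], gt[2], lt[3]),
        (lt[0], gt[1], lt[2], gt[3]),
    ]
    return any(all(e) for e in expressions)
```

Because all inequalities are strict, segments that merely touch at an endpoint or lie on a common line are reported as non-intersecting in this sketch.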

Our next step is to compute each of the above expressions. As before, nobody 
should learn the individual comparison results, just the aggregate. Let us use E to 
denote any one of the above expressions. Using the Vector Dominance Protocol, 
we can get Alice to know a random piece t, and Bob to know another random 
piece s, such that E is true if and only if t = s. 

Now Alice has $4n^2$ numbers $(t_1, \ldots, t_{4n^2})$. Bob has
$(s_1, \ldots, s_{4n^2})$. We want to know whether there exists an
$i \in \{1, \ldots, 4n^2\}$ such that $t_i = s_i$. Although there are some other
approaches to achieve this, we believe using the circuit evaluation protocol is
efficient in this case, because the size of the circuit is small (linear in the
number of the items). The security of the circuit evaluation protocol guarantees
that only the final result (yes or no) will be disclosed; nobody learns any other
information, such as how many $t_i$’s equal $s_i$’s, and for which $i$ we have
$t_i = s_i$.

The following is the outline of the protocol: 



1. Let $m = n_a * n_b$.

2. For each pair of edges, perform the following sub-protocol. Suppose the index
of this edge pair is $i$, for $i = 1, \ldots, m$; suppose
$(f, (x_1, y_1), (x'_1, y'_1)) \in A$ and $(g, (x_2, y_2), (x'_2, y'_2)) \in B$
are two edges.

a) Using the scalar product protocol, Alice gets $U = (u_1, u'_1, u_2, u'_2)$, and
Bob gets $R = (r_1, r'_1, r_2, r'_2)$, where $u_1 = f(x_2, y_2) + r_1$,
$u'_1 = f(x'_2, y'_2) + r'_1$, $u_2 = g(x_1, y_1) + r_2$, and
$u'_2 = g(x'_1, y'_1) + r'_2$.

b) Using the Vector Dominance Protocol, Alice gets
$t_{i,1}, t_{i,2}, t_{i,3}, t_{i,4}$, and Bob gets
$s_{i,1}, s_{i,2}, s_{i,3}, s_{i,4}$.

3. Alice has $(t_{1,1}, t_{1,2}, t_{1,3}, t_{1,4}, \ldots, t_{m,1}, t_{m,2}, t_{m,3}, t_{m,4})$,
and Bob has $(s_{1,1}, s_{1,2}, s_{1,3}, s_{1,4}, \ldots, s_{m,1}, s_{m,2}, s_{m,3}, s_{m,4})$.
Alice and Bob use the circuit evaluation method to find out whether there exist
$i \in \{1, \ldots, m\}$, $j \in \{1, \ldots, 4\}$, such that $t_{i,j} = s_{i,j}$.

4 Applications 

The following two scenarios describe some potential applications of the problems 
we have discussed in this paper. 

1. After costly market research, company A has decided that expanding its market
share in some region will be very beneficial; therefore A is planning to
do this. However, A is aware that another competing company B is also
planning to expand its market share in some region. Strategically, A and B
do not want to compete against each other in the same region, so they want
to know whether they have a region of overlap. Of course, they do not want
to give away location information because not only does this information
cost both companies a lot of money, but it can also cause significant damage
to the company if it were disclosed to other parties: for example, a larger
competitor can immediately occupy the market there before A or B even
starts; or some real estate company can actually raise its price during the
negotiation if it knows A or B is very interested in that location. Therefore,
they need a way to solve the problem while maintaining the privacy of their
locations.

2. A country A decides to bomb a location x in some other country; however, A
does not want to hurt its relationship with its friends, who might have some
areas of interest in the bombing region: for example, those countries might
have secret businesses, secret military bases, or secret agencies in that area.
Obviously, A does not want to disclose the exact location of x to all of its
friends, except the one who will definitely be hurt by this bombing; on the
other hand, its friends do not want to disclose their secret areas to A either,
unless they are in the target area. How could they solve this dilemma? If each
secret area is represented by a secret polygon, the problem becomes how to
decide whether A’s secret point is within B’s polygon, where B represents one
of the friend countries. If the point is not within the polygon, no information
should be disclosed, including information such as whether the location is
to the west of the polygon, or within a certain longitude or latitude. Basically
it is “all-or-nothing”: if one will be bombed, it knows all; otherwise it knows
nothing.



5 Conclusions and Future Work 

In this paper, we have considered several secure two-party computational
geometry problems and presented some preliminary work for solving such problems.
For the purpose of doing so, we have also presented two useful building blocks:
the Secure Two-Party Scalar Product Protocol and the Secure Two-Party Vector
Dominance Protocol.

In the protocols for the Point-Inclusion problem and the Intersection problem,
we use an inefficient algorithm to decide whether a point is inside a polygon (or
whether two polygons intersect) although more efficient solutions exist, because
of the concern about information disclosure. In our future work, we will study
how to take advantage of those efficient solutions without degrading the privacy.

Abstract. We consider the problem of placing a regular grid over a set
of points in order to minimize (or maximize) the number of grid cells
not containing any points. We give an $O(n \log n)$ time and $O(n)$ space
algorithm for this problem. As part of this algorithm we develop a new
data structure for solving a problem on intervals that is interesting in its
own right and may have other applications.



1 Introduction 

The grid placement problem relates the positioning of a regular rectangular 
grid over a pointset. The focus of the problem is to place the grid and then 
count the number of grid cells containing at least one point. The problem has 
two variations. In the first variation the task is to calculate the grid position 
which minimizes the number of empty cells. The second variation calculates the 
position which maximizes the number of empty cells. Figure 1 shows an example
solution for both problems.

This grid placement problem arises in gridding/interpolation where one at- 
tempts to derive a regular grid of data points from an irregular set of sample 
points. In this setting, one wants to minimize the number of grid cells not con- 
taining any sample point. 

Another example application of this problem is in the layout of maps into 
books. In this case each grid cell corresponds to a page in the book and each 
point represents map features. In this case, minimizing the number of empty 
cells decreases the number of features on many pages. This in turn makes the 
pages easier to label and reduces the number of uninteresting pages. 

In this paper, we give an $O(n \log n)$ time algorithm for solving the grid
placement problem for a set of $n$ points and a grid of arbitrary size. As part of
this algorithm we develop a new data structure for solving a problem on intervals
that is interesting in its own right and may have other applications.

* This research was partly supported by the Natural Sciences and Engineering
Research Council of Canada, the Government of Ontario and by NCE-GEOIDE. An
extended abstract of this paper was presented at the European Workshop on
Computational Geometry 2001 in Berlin.

Fig. 1. A grid placement (a) with the minimum number of empty cells, (b) with the 
maximum number of empty cells and (c) in standard position. 



In previous related work, by Asano and Tokuyama ^ and Agarwal et al. 
the problem was to minimize the maximum number of points in a given cell. In 
the problem that was considered the grid was allowed to rotate and change in 
size. Asano and Tokuyama produced a result of 0{b^n^logn) time where is 
the number of grid cells. Agarwal et al. gave a randomized algorithm which works 
in 0(min(B5n5 log^ n + Bi An\o^ n,n?)) time where the maximal number of 
points in a cell is ^ + A and B is the number of grid cells. It should be noted 
that in both of these cases the only grids considered had an equal number of 
rows and columns. 

In the following sections we give a discussion of the solution to the minimiza- 
tion problem and leave the trivial extension to the maximization problem to the 
reader. Our discussion formally defines the problem and describes and proves 
correctness of a basic algorithm. This basic algorithm is then refined into our 
final solution and analyzed. 



2 Problem Definition and Analysis 

We start by defining the set of $n$ points in the problem to be the points
$P = \{p_1, p_2, \ldots, p_n\}$ embedded in the plane. The location of a point
$p_i$ in the plane is represented as $(p_{ix}, p_{iy})$. A grid is defined as a
packed tiling of regular rectangles called cells. The sides of each of the cells
are assumed to be parallel to the x and y axes of the plane and have dimensions
$(c_x, c_y)$. A specific instance of a grid, $G$, is defined by its size and its
position. The size of $G$ is defined by the number of cell columns ($c$) and rows
($r$). The position of the grid $(G_x, G_y)$ is the location of the cell corner
which has minimal x and y coordinates (i.e., the lower left corner). A grid
position is allowable if and only if every point of $P$ is contained in some cell
of $G$.

There are a variety of allowable grid positions relative to the pointset $P$. In
standard position $(G_{sx}, G_{sy})$, as shown in Figure 1c, the uppermost and
rightmost points in $P$ are coincident with the uppermost and rightmost boundaries
of $G$, respectively. We finish the problem definition by assuming that in the
standard position the leftmost column and bottommost row of cells do not contain
any points. Later we will show how to remove this assumption. Thus the problem
is to find an allowable grid position which minimizes the number of cells that do
not contain points from $P$.

Examining allowable grid positions yields the interesting result given in
Lemma 1.

Lemma 1. Given two distinct allowable positions $a = (a_x, a_y)$ and
$b = (a_x + h_x, a_y + h_y)$ where $h_x = i c_x$, $h_y = j c_y$ and
$i, j \in \mathbb{Z}$, the number of non-empty cells is identical for both
positions.

Proof: Since both positions are allowable the grid at both positions a and 
b must contain all points. This further implies that the grid located at a must 
intersect the grid located at b and that only intersecting cells can contain points. 
Also, since i and j are integers then any cell can only intersect at most one other 
cell. Thus, the sets of cells which overlap form a one-to-one correspondence. This 
establishes a bijection between non-empty cells at a and non-empty cells at b. 
Therefore there are the same number of non-empty cells at both locations. □ 
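
The invariance stated in the lemma is easy to observe numerically. The sketch below counts non-empty cells for a grid anchored at a given lower left corner; it treats the grid as unbounded, which matches the lemma's setting only under the assumption that both positions are allowable. The function name and the use of integer coordinates are illustrative choices.

```python
def non_empty_cells(points, origin, cell):
    """Number of distinct cells hit by the points for a grid anchored at origin."""
    gx, gy = origin
    cx, cy = cell
    # Floor division assigns each point to its (column, row) cell index.
    return len({((px - gx) // cx, (py - gy) // cy) for px, py in points})
```

Shifting the origin by whole multiples of the cell dimensions only relabels the cell indices, so the count is unchanged, whereas a fractional shift can change it.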

Using Lemma 1 it is possible to limit the number of allowable grid positions
that need to be considered. The exact space of allowable grid positions is defined
in Theorem 1.

Theorem 1. The only allowable grid positions that need to be considered are 
the “kernel” positions defined by the bottom, left cell in standard position. 

Proof: The proof is divided into two parts. First that all of the kernel positions 
are allowable and second that all other allowable positions are equivalent to 
some kernel position. Given that the standard position is allowable, that the 
bottommost row is empty and that the leftmost column is also empty, it is clear 
that all the kernel positions are allowable. 

It remains to be proven that all other allowable positions are equivalent to
some kernel position. Let $A$ be the set of allowable positions. By the definition
of the standard position any translation of the grid down and/or to the left
will result in at least one point $p_i$ being outside of the grid. Thus all
elements $a = (a_x, a_y)$ of $A$ must be expressible by a non-negative translation
from the standard grid position. That is, $a = (G_{sx} + h_x, G_{sy} + h_y)$ where
$h_x, h_y \geq 0$. Given Lemma 1 the theorem follows trivially. □



3 Basic Algorithm 

The algorithm itself consists of two main stages. In the first stage, we calculate 
the regions of the kernel for which each cell is empty. In the second stage, we 
overlay these empty regions and calculate a position contained in the minimum 
number of empty regions. 

For the cell $C_{(i,j)}$¹ we define $P(C_{(i,j)})$ to be the subset of $P$
contained in $C_{(i,j)}$ at standard position. Since only the kernel positions are
considered, $C_{(i,j)}$ can only contain elements from $P(C_{(i,j)})$,
$P(C_{(i+1,j)})$, $P(C_{(i,j+1)})$ and $P(C_{(i+1,j+1)})$. For $C_{(i,j)}$ to be
empty at the grid position $a$, the cell must not include any element of any of
these sets.

¹ Each cell is labeled by its row and column with $C_{(1,1)}$ in the bottom left
and $C_{(r,c)}$ in the upper right.

To understand the empty region for $C_{(i,j)}$ first consider only the set
$P(C_{(i,j)})$. A point $X = (X_x, X_y)$ is dominated by the set $P(C_{(i,j)})$ if
and only if there exists an element $p_i$ of $P(C_{(i,j)})$ such that
$p_{ix} > X_x$ and $p_{iy} > X_y$. The cell $C_{(i,j)}$ does not contain an element
of $P(C_{(i,j)})$ if and only if the lower left corner of $C_{(i,j)}$ is not
dominated by $P(C_{(i,j)})$. Thus the maximal elements [114] define the region
in which the cell does not contain any points of $P(C_{(i,j)})$. By symmetry the
same can be done for $P(C_{(i+1,j)})$, $P(C_{(i,j+1)})$ and $P(C_{(i+1,j+1)})$ and
the intersection of these four regions defines the empty region for $C_{(i,j)}$.
See Figure 2 for an example.












Fig. 2. Using maximal elements to compute the empty region for $C_{(i,j)}$.



The intersection of these four regions is done in a particular manner to avoid
added complexity in the second stage. First, the appropriate set of maximal
elements is calculated for each of the four cells². Then the intersection of the
regions for $P(C_{(i,j)})$ and $P(C_{(i+1,j)})$ is calculated, followed by the
intersection for the remaining two regions. The results of these intersections are
two orthogonal convex polygons.

Each of these polygons is stored as a list corresponding to the events of
a vertical sweep from the top to bottom of the kernel. At each event in the
vertical sweep, the line interval resulting from the intersection of the sweep line
and the polygon is stored. The vertical and horizontal position of the event and
the corresponding width of the interval are stored with each of the intervals. It
is assumed that the list of intervals is sorted by its vertical position. The two
polygons are then intersected by merging the two lists to create a new list of
intervals and their insertion and deletion heights. The results of such a process
are displayed in Figure 3.

Suppose $|P(C_{(i,j)}) \cup P(C_{(i+1,j)}) \cup P(C_{(i,j+1)}) \cup P(C_{(i+1,j+1)})| = k_{(i,j)}$.
The maximal elements can be computed in $O(k_{(i,j)} \log k_{(i,j)})$ time [114],
and the intersection of the four empty regions can also be done in $O(k_{(i,j)})$
time as described above. Furthermore, any point in $P$ is used in the computation
of the empty regions for at most 4 grid cells. Therefore, the overall running time
thus far is $O(\sum k_{(i,j)} \log k_{(i,j)}) = O(n \log n)$.
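
A common way to obtain the maximal elements within the stated time bound is to sort by decreasing x and keep a running maximum of y. The sketch below shows that standard technique, which is not necessarily the cited method; it assumes distinct coordinates, and the function names are hypothetical.

```python
def maximal_elements(points):
    """Points not dominated by any other point, ordered by decreasing x.

    Sorting dominates the cost, giving O(k log k) for k points.
    """
    staircase = []
    best_y = float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:          # nothing to its right is higher
            staircase.append((x, y))
            best_y = y
    return staircase


def is_dominated(corner, staircase):
    """True when some maximal element is strictly larger in both coordinates."""
    cx, cy = corner
    return any(px > cx and py > cy for px, py in staircase)
```

A corner position keeps the cell free of these points exactly when `is_dominated` is false, so only the staircase has to be consulted rather than the whole point set.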

² If one of the pointsets to be calculated does not exist due to the grid boundary
then the pointset can be considered empty.

Fig. 3. The two preliminary intersections and the final result of intersecting the
empty regions for $C_{(i,j)}$, including all intervals created.



In the second stage of the algorithm the empty regions for all of the cells
have to be overlaid and a position with the minimum number of empty regions
found. This could be done by computing an arrangement of rectilinear polygons,
but this would take $\Omega(n^2)$ time and space. Instead, we perform a vertical
plane sweep of the kernel from top to bottom. There are two types of events
during the sweep, the addition of an interval at its maximum height and the
deletion of an interval at its minimum height. At each point in time we would
like to keep track of a point that is contained in the least number of intervals.
The minimum such value, over all points in time, corresponds to a point in the
kernel that is contained in the least number of empty regions.

Since the problem has been reduced to the insertion and deletion of intervals in a
sweep of the plane, the second stage of the algorithm yields a solution by
solving the following problem. Given that a set of one dimensional intervals $I$
are inserted and deleted at times $\{t_1, \ldots, t_s\}$, find the time $t$ and
the location $x$ where the minimum number of intervals overlap.

The algorithm will maintain a tree structure T which will answer queries 
about how many intervals are overlapping at a given x-coordinate and what the 
minimum overlapping intervals are. A complete binary search tree is created 
with a leaf for every x-coordinate in P plus the maximal x-coordinates for the 
kernel. The tree is assumed to have enough information to perform searches on 
x-coordinates {right, left, parent{p), key, . . .). Additionally, every vertex v in 
the tree will hold values stab[v] and ptr[v]. 

In a leaf vertex $u$, $stab[u]$ will contain the number of intervals that overlap
at its corresponding x-coordinate. At an internal vertex $v$,
$stab[v] = \min(stab[u])$ such that $u$ is any leaf of the sub-tree rooted at $v$.
Each vertex $v$ also contains a pointer $ptr[v]$ to its child $v'$ that has the
minimum stab value. In all future discussions, we will use the notation $v'$ to
denote the child of $v$ with minimum stab value. For all leaves $ptr[v] = v$.

With this structure an interval is inserted by incrementing stab[u] at each 
leaf u whose key is contained by the interval (all intervals are closed). Then all 
ancestors v of the updated leaves are updated in a bottom up fashion. Deletion of 
an interval is done in an identical fashion by decrementing the necessary leaves. 
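
A direct, unoptimized version of this structure can be sketched as an array-based complete binary tree. Several details are assumptions made for the illustration: padding leaves carry an infinite stab value so that they never win, all internal vertices are recomputed after every update (a superset of the ancestors mentioned above), and the class and method names are hypothetical.

```python
import bisect


class StabTree:
    """Leaves hold overlap counts; internal vertices hold the subtree minimum."""

    def __init__(self, keys):
        self.keys = sorted(set(keys))        # assumed non-empty
        self.size = 1
        while self.size < len(self.keys):
            self.size *= 2
        self.stab = [float("inf")] * (2 * self.size)
        self.ptr = list(range(2 * self.size))   # a leaf points to itself
        for i in range(len(self.keys)):
            self.stab[self.size + i] = 0
        self._rebuild()

    def _rebuild(self):
        # Bottom-up pass: each vertex copies its smaller child.
        for v in range(self.size - 1, 0, -1):
            left, right = 2 * v, 2 * v + 1
            child = left if self.stab[left] <= self.stab[right] else right
            self.ptr[v] = child
            self.stab[v] = self.stab[child]

    def update(self, x1, x2, val):
        """Insert (val = +1) or delete (val = -1) the closed interval [x1, x2]."""
        lo = bisect.bisect_left(self.keys, x1)
        hi = bisect.bisect_right(self.keys, x2)
        for i in range(lo, hi):
            self.stab[self.size + i] += val
        self._rebuild()

    def minimum(self):
        """Return (minimum overlap, x-coordinate attaining it) via ptr links."""
        v = 1
        while v < self.size:
            v = self.ptr[v]
        return self.stab[1], self.keys[v - self.size]
```

Each update in this form costs time linear in the number of keys, which is the cost that a lazier update scheme is meant to avoid.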

For the purpose of our algorithm and its analysis each interval update is
further broken into two updates. An interval $[x_1, x_2]$ is equivalent to
$[x_1, \infty) \setminus (x_2, \infty)$. Therefore the insertion of $[x_1, x_2]$ is
broken into inserting $[x_1, \infty)$ and deleting $[x_3, \infty)$ where $x_3$ is
the minimal x-coordinate greater than $x_2$. Deletion of an interval is done
similarly. Thus all of the interval updates which define the empty regions must be
converted into two updates.

The algorithm is run by first sorting all of the interval updates by y- 
coordinate. Two null events are also added to occur at the maximal y-coordinates 
of the kernel. Any events with identical y-coordinate are then grouped together. 
The answer to the problem is then found by processing each group of events in 
order. After each group, excepting the last group, compare the known minimum 
value with the value at the root of the tree. If it is less than the current minimum 
then make it the new minimum. Also follow the chain of vertices connected by 
ptr values from root[T] to a leaf u. Record key[u] as the x-coordinate and the 
y-coordinate of that event group as the y-coordinate of the minimum. 

4 Refined Algorithm 

The main problem with the Basic Algorithm is the complexity of each interval 
update. For any given update $[x, \infty)$ let the vertex chain 
$V(x) = \{v_k, \ldots, v_1, v_0\}$ be the vertices along the path from 
$\mathit{root}[T] = v_k$ to the leaf $v_0$ with $\mathit{key}[v_0] = x$. 
Then consider updating the tree T with $[x, \infty)$ and a value val (val = 1 for 
insertion, val = −1 for deletion). The basic algorithm requires every vertex in 
any right sub-tree of the path $V(x)$ to be updated. The refined algorithm reduces 
the cost of updating by postponing the updating for any such sub-tree until 
absolutely necessary. This is achieved by performing an update on the root of 
a sub-tree and setting a mark bit for that root. This mark bit signifies that the 
node has been updated and its descendants have not. 

In refining the data structure for T, a different algorithm is required to update 
the interval $[x, \infty)$ with a given value val. The refined Update-Interval 
algorithm is broken conceptually into two phases. In the first phase the path 
$V(x)$ is followed from root to leaf performing necessary updates to T (see lines 
1-17 of Algorithm 1). These necessary updates involve unmarking all vertices on 
$V(x)$ and marking the root of all right sub-trees of $V(x)$. The algorithm then 
updates the leaf node $v_0$ by adding val to $\mathit{stab}[v_0]$. 

The second phase of the algorithm then traverses the vertices $v_i \in V(x)$ 
from leaf to root updating $\mathit{ptr}[v_i]$ appropriately and updating 
$\mathit{stab}[v_i]$ to be the value stored in the child of $v_i$ that has the 
minimal stab value (see lines 19-25 of Algorithm 1). 
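
For readers who want to experiment with the operations involved, the following Python sketch supports the same two operations: adding val to a suffix $[x, \infty)$ of the leaves, and reporting a leaf of minimum stab value. It is not a transcription of Algorithm 1. It swaps in the standard lazy-propagation segment tree, in which a pending add is stored explicitly at a node instead of being encoded by a mark bit and the difference stab[v] − stab[v']. The class name `LazyMinTree`, its methods, and the use of leaf indices 0..m−1 in place of keys are all assumptions of this sketch.

```python
class LazyMinTree:
    """Suffix-add / global-min tree over m leaves (standard lazy propagation)."""

    def __init__(self, m):
        self.m = m
        self.mn = [0] * (4 * m)    # minimum stab value in the subtree
        self.arg = [0] * (4 * m)   # index of a leaf attaining that minimum
        self.lazy = [0] * (4 * m)  # add not yet pushed down to the children
        self._build(1, 0, m - 1)

    def _build(self, node, lo, hi):
        self.arg[node] = lo        # all values are 0 initially; pick the leftmost leaf
        if lo < hi:
            mid = (lo + hi) // 2
            self._build(2 * node, lo, mid)
            self._build(2 * node + 1, mid + 1, hi)

    def _apply(self, node, val):
        # A uniform add changes the minimum but not which leaf attains it.
        self.mn[node] += val
        self.lazy[node] += val

    def add_suffix(self, start, val, node=1, lo=0, hi=None):
        """Add val to every leaf with index >= start."""
        if hi is None:
            hi = self.m - 1
        if start > hi:             # subtree entirely left of the suffix
            return
        if start <= lo:            # subtree entirely inside the suffix: postpone
            self._apply(node, val)
            return
        if self.lazy[node]:        # push the postponed add one level down
            self._apply(2 * node, self.lazy[node])
            self._apply(2 * node + 1, self.lazy[node])
            self.lazy[node] = 0
        mid = (lo + hi) // 2
        self.add_suffix(start, val, 2 * node, lo, mid)
        self.add_suffix(start, val, 2 * node + 1, mid + 1, hi)
        left, right = 2 * node, 2 * node + 1
        best = left if self.mn[left] <= self.mn[right] else right
        self.mn[node], self.arg[node] = self.mn[best], self.arg[best]

    def min_point(self):
        """Return (minimum stab value, index of a leaf attaining it)."""
        return self.mn[1], self.arg[1]
```

Each call to `add_suffix` visits one root-to-leaf path plus the roots of the right sub-trees hanging off it, which mirrors the cost argument made for the refined algorithm.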

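To prove the correctness of our algorithm we define the following variables: 

$$\Delta_v = \begin{cases} \mathit{stab}[v], & \text{if } \mathit{leaf}[v] = \mathrm{TRUE}; \\ \mathit{stab}[v] - \mathit{stab}[v'], & \text{if } \mathit{mark}[v] = \mathrm{TRUE}; \\ 0, & \text{otherwise.} \end{cases} \qquad (1)$$

$$\mathit{STAB}_i[x] = \text{number of intervals overlapping } x \text{ after the } i\text{-th interval update.} \qquad (2)$$
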
Algorithm 1 Update-Interval(x, val) - Adds [x, ∞) of value val to tree T 
 1: v ← root[T] 
 2: d ← 0 
 3: while leaf[v] = FALSE do 
 4:   if mark[v] = TRUE then 
 5:     Unmark(v) (see Algorithm 2) 
 6:   end if 
 7:   d ← d + 1 
 8:   if key[v] < x then 
 9:     v ← right[v] 
10:   else {interval begins in the left sub-tree of v} 
11:     if leaf[right[v]] = FALSE then 
12:       mark[right[v]] ← TRUE 
13:     end if 
14:     v ← left[v] 
15:   end if 
16: end while 
17: stab[v] ← stab[v] + val 
18: while d ≠ 0 do 
19:   if left[p[v]] = v then 
20:     stab[right[p[v]]] ← stab[right[p[v]]] + val 
21:   end if 
22:   v ← p[v] 
23:   d ← d − 1 
24:   stab[v] ← stab[v'] 
25:   ptr[v] ← v' 
26: end while 



Algorithm 2 Unmark(v) 
Require: leaf[v] = FALSE 
1: δ ← stab[v] − stab[v'] 
2: stab[right[v]] ← stab[right[v]] + δ 
3: stab[left[v]] ← stab[left[v]] + δ 
4: if leaf[right[v]] = FALSE then 
5:   mark[right[v]] ← TRUE 
6:   mark[left[v]] ← TRUE 
7: end if 
8: mark[v] ← FALSE 



Property 1. The tree T is said to have Property 1 after the $i$-th interval update 
if and only if Equation 3 is true, where $V(x) = \{v_k, \ldots, v_0\}$ is a path of 
vertices from $\mathit{root}[T] = v_k$ to a leaf $v_0$ with $\mathit{key}[v_0] = x$. 

$$\sum_{j=0}^{k} \Delta_{v_j} = \mathit{STAB}_i[x]. \qquad (3)$$

Lemma 2. Given T such that Property 1 holds, then the property will still hold 
after performing procedure Unmark(v) (see Algorithm 2) on any marked vertex v. 

Proof: Since Unmark(v) only affects the stab values of v, right[v] and left[v], 
only $\Delta_v$, $\Delta_{\mathit{right}[v]}$ and $\Delta_{\mathit{left}[v]}$ are 
affected. Since v is a marked vertex before Unmark(v) is run and an unmarked 
vertex after, any sum involving $\Delta_v$ must decrease by $\Delta_v$. However, 
Unmark(v) also increases the values of $\Delta_{\mathit{right}[v]}$ and 
$\Delta_{\mathit{left}[v]}$ by the same amount and each sum must contain one of 
these two values. □ 

Theorem 2. T always has Property 1 after the $i$-th interval update of 
Algorithm 1. 

Proof: The proof is by induction on i. 

Base Case (After initialization): Before the first update there are no intervals in 
T. Thus if T is initialized with mark[v] = FALSE, stab[v] = 0 and ptr[v] = 
left[v] then Property 1 is trivially true. 

Induction Hypothesis: Assume that it is true for the $i$-th interval update. 

$(i+1)$-st Interval Update: We prove that Property 1 is true for the $(i+1)$-st 
update in three cases, for vertices v with key[v] = x, key[v] < x and key[v] > x. 

The first case considers the leaf $v_0$ representing the leftmost portion of the 
interval being updated (i.e., $\mathit{key}[v_0] = x$). After line 16 all vertices 
in the path $V(x)$ are unmarked. It follows from the definition of $\Delta_v$ that 
$\Delta_v = 0$ for all $v \in V(x)$ except $v_0$ before the $(i+1)$-st update. By 
Lemma 2 the expression $\sum_{j=0}^{k} \Delta_{v_j} = \mathit{STAB}_i(x)$ must be 
true before line 17. Combining these two equations with Equation 1 yields that 
$\Delta_{v_0} = \mathit{STAB}_i(x) = \mathit{stab}[v_0]$. Thus, after line 17, 
$\mathit{stab}[v_0] = \mathit{STAB}_i(x) + \mathit{val} = \mathit{STAB}_{i+1}(x)$, 
which proves the first case. 

The second case considers, without loss of generality, any leaf $u_0$ with 
key value less than the leftmost portion of the interval being updated (i.e., 
$\mathit{key}[u_0] = w < x$). Again, it follows from the definition of $\Delta_v$ 
that $\Delta_v = 0$ for all $v \in V(x)$ except $v_0$ before the $(i+1)$-st update. 
By Lemma 2 the expression $\sum_{j=0}^{k} \Delta_{v_j} = \mathit{STAB}_i(x)$ must 
be true before line 17. Let the lowest common ancestor 
$\mathit{lca}(u_0, v_0) = u_h$. By definition of lca, all vertices 
$\{u_h, u_{h+1}, \ldots, u_k\} \in V(x)$. Thus 
$\sum_{j=0}^{h-1} \Delta_{u_j} = \mathit{STAB}_i(w)$. Since vertices 
$\{u_{h-1}, u_{h-2}, \ldots, u_0\}$ are unaffected by the $(i+1)$-st update, 
after line 26: 
$\sum_{j=0}^{k} \Delta_{u_j} = \sum_{j=0}^{h-1} \Delta_{u_j} = \mathit{STAB}_i(w)$. 
Since $\mathit{STAB}_i(w) = \mathit{STAB}_{i+1}(w)$, the second case is proven. 

The third case follows the same logic as the second case with only one 
difference. On lines 19 and 20 the value of $\mathit{stab}[u_{h-1}]$ is incremented 
by val. Thus after line 26: 
$\sum_{j=0}^{k} \Delta_{u_j} = \sum_{j=0}^{h-1} \Delta_{u_j} = \mathit{STAB}_i(w) + \mathit{val}$. 
Since $\mathit{STAB}_{i+1}(w) = \mathit{val} + \mathit{STAB}_i(w)$, the third case 
is proven. □ 

Lemma 3. After the $i$-th interval update, the pointer ptr[v] at any vertex v equals 
its child vertex v' with the minimum stab value. 
Proof: by induction on the number of interval updates ($i$). 

Base Case (after initialization): Before the first update there are no intervals in T. 
Thus if T is initialized with mark[v] = FALSE, stab[v] = 0 and ptr[v] = left[v] 
then the property is trivially true. 

Induction Hypothesis: Assume that it is true for the $i$-th interval update. 

Proof of $(i+1)$-st update: During the $(i+1)$-st update the only vertices whose 
stab values change are the vertices on the path $V(x)$ or their children. Since the 
property holds after the $i$-th update the path vertices are the only vertices whose 
ptr values may need adjustment. Since these values are adjusted on line 25, 
the property must be true for the $(i+1)$-st update. □ 

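Consider the path $v_k = \mathit{root}[T], v_{k-1} = \mathit{ptr}[v_k], \ldots, 
v_0 = \mathit{ptr}[v_1]$ and let $x = \mathit{key}[v_0]$. By Theorem 2, after the 
$i$-th insertion $\mathit{STAB}_i(x) = \sum_{j=0}^{k} \Delta_{v_j}$. 
Furthermore, by the definition of $\Delta_v$ and Lemma 3, 
$\Delta_{v_j} = \mathit{stab}[v_j] - \mathit{stab}[v_{j-1}]$ for $1 \le j \le k$. 
Therefore, 

$$\mathit{STAB}_i(x) = \sum_{j=0}^{k} \Delta_{v_j} \qquad (4)$$

$$\Delta_{v_j} = \mathit{stab}[v_j] - \mathit{stab}[v_{j-1}] \qquad (5)$$

By using Equations 4 and 5 together: 

$$\mathit{STAB}_i(x) = \mathit{stab}[v_0] + \sum_{j=1}^{k} \left( \mathit{stab}[v_j] - \mathit{stab}[v_{j-1}] \right) \qquad (6)$$

$$= \mathit{stab}[v_k] \qquad (7)$$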

Theorem 3. Given the definition of $v_k$ stated above, $\mathit{stab}[v_k]$ contains 
the appropriate minimal value after the $i$-th interval update. 

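Proof: Consider any leaf vertex $u_0$ such that $\mathit{key}[u_0] = w$. By 
rearranging Equation 3 we arrive at Equation 10. 

$$\mathit{STAB}_i(w) = \sum_{j=0}^{k} \Delta_{u_j} \qquad (8)$$

$$= \mathit{stab}[u_0] + \sum_{j=1}^{k} \left( \mathit{stab}[u_j] - \mathit{stab}[u'_j] \right) \qquad (9)$$

$$= \mathit{stab}[u_k] + \sum_{j=0}^{k-1} \left( \mathit{stab}[u_j] - \mathit{stab}[u'_{j+1}] \right) \qquad (10)$$

By the definition of $v'$, the value $\mathit{stab}[u_j] - \mathit{stab}[u'_{j+1}]$ 
is minimized when $u_j = u'_{j+1}$. By Lemma 3, $v_j = v'_{j+1}$ on the path defined 
above, and therefore $\mathit{STAB}_i(x)$ is minimal. The proof is concluded because 
$\mathit{stab}[v_k] = \mathit{STAB}_i(x)$ by Equation 7. □ 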

It is clear that the running time of Update-Interval is $O(\log n)$. Furthermore, 
Eq. (7) shows how to find a point overlapped by the least number of intervals in 
$O(\log n)$ time. Since these are exactly the operations required to solve the grid 
placement problem, we obtain our main result. 

Theorem 4. The grid placement problem can be solved in $O(n \log n)$ time and 
$O(n)$ space. 

All that remains to be shown is that the points in P can be preprocessed using 
this time and space to allow two specific types of queries. First, given a point 
in $P(C_{(i,j)})$, report all points in $P(C_{(i,j)})$ in $O(|P(C_{(i,j)})|)$ time. 
Second, given a point in $P(C_{(i,j)})$, report points in $P(C_{(i,j+1)})$, 
$P(C_{(i+1,j)})$ and $P(C_{(i+1,j+1)})$ in $O(1)$ time, returning null pointers 
should any of those sets be empty. 

To perform this task we use a linked list of points in which each point will 
have pointers to points in the cells above and to the right of its cell. As described 
previously these pointers may be null if the cells mentioned are empty. If the list 
is sorted to contain points from any specific cell in a contiguous block then the 
first requirement is satisfied. This is trivially accomplished by sorting the points 
primarily by the row they fall into and secondarily by the column they fall into. 
This sorting is achieved in $O(n \log n)$ time and the structure contains a constant 
amount of space per point so the $O(n)$ space requirement is also fulfilled. 

It remains to be shown how to preprocess the points to obtain the required 
pointers to points in the neighbouring cells. This is accomplished by first sorting 
the points primarily by the column and then by the row they lie in. In this 
format points in vertically adjacent cells are in adjacent contiguous blocks in the 
list. By passing through the list and remembering a pointer to a point in the 
previous contiguous block it is possible to scan through the points once and set 
the appropriate pointers for the vertically adjacent cells. 

The remaining pointers to the horizontally adjacent cells can be computed 
similarly by sorting the list by row and then column. This method requires 
$O(n \log n)$ time and $O(n)$ space, thereby proving Theorem 4. 
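
As an illustration of the preprocessing just described, the Python sketch below groups points into contiguous per-cell blocks by sorting on (row, column) and then records, for each non-empty cell, where the blocks of the cells above, to the right, and diagonally above-right begin. Several details are assumptions of the sketch rather than part of the construction above: cells are taken to be axis-aligned squares of side `w` anchored at `(ox, oy)`, a cell is identified as (column, row), a sorted Python list stands in for the linked list, and the neighbour lookup uses a dictionary instead of the second and third sorting passes.

```python
from itertools import groupby

def build_cell_index(points, w, ox=0.0, oy=0.0):
    """Group points by grid cell and locate the neighbouring cells' blocks.

    Returns (order, blocks, neighbours) where
      order      - points sorted by (row, column), one contiguous block per cell
      blocks     - cell -> (start, end) half-open index range into order
      neighbours - cell -> (above, right, above_right): start index of that
                   neighbouring cell's block in order, or None if it is empty
    """
    def cell(p):
        return (int((p[0] - ox) // w), int((p[1] - oy) // w))  # (column, row)

    order = sorted(points, key=lambda p: (cell(p)[1], cell(p)[0]))

    blocks = {}
    pos = 0
    for c, group in groupby(order, key=cell):
        size = len(list(group))
        blocks[c] = (pos, pos + size)
        pos += size

    neighbours = {}
    for (i, j) in blocks:
        neighbours[(i, j)] = tuple(
            blocks[c][0] if c in blocks else None
            for c in ((i, j + 1), (i + 1, j), (i + 1, j + 1))
        )
    return order, blocks, neighbours
```

Reporting all points of a cell is then a scan of `order[start:end]`, which takes time proportional to the number of points in that cell.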

5 Removing the Assumption 

Finally we address the cases wherein the bottom row and leftmost column of 
the grid G are not necessarily empty in standard position. The algorithm still 
applies, however the kernel of allowable positions changes. Instead of mapping 
allowable positions into the entire bottom/leftmost grid cell $C_{(1,1)}$, the 
allowable positions map into a sub-region R of $C_{(1,1)}$. It is easily shown that 
there are only three configurations for R, which are all shown in Figure 4. 

First the algorithm must calculate R using the point(s) $P_{\min(x)}$ and 
$P_{\min(y)}$ which have the least x- and y-coordinates respectively. These points 
are then used to calculate the region R as shown in Figure 4. During the algorithm the 

Fig. 4. The two points $P_{\min(x)}$ and $P_{\min(y)}$ define three possible 
configurations for the region R: (a) column 1 is empty, (b) row 1 is empty, and 
(c) neither is empty. 



empty regions are calculated as described previously. Then the list of intervals is 
modified to include only those portions which intersect R. The structure T is 
then based solely on x-coordinates which appear in the modified interval list. 

Since all of these operations require linear time in the number of points and 
involve at most constant additional space none of the previous theorems are 
changed. 

6 Conclusions 

We have studied the grid placement problem and given an $O(n \log n)$ time and 
$O(n)$ space algorithm. This was accomplished by reducing it to the following data 
structuring problem: Maintain a set of intervals with endpoints drawn from a 
set of $O(n)$ points under insertions and deletions so that at any point in time 
a point included in the least (or most) number of intervals can be reported in 
$O(\log n)$ time. We have developed a new lazy data structure for this problem 
that requires $O(n)$ space and operates in $O(\log n)$ time per operation (insert, 
delete and query). 

Since the acceptance of this paper, the authors have discovered a similar 
work by Lee on the interval problem. In the paper, Lee solves the maximum 
clique problem on an interval graph. That is, his structure reports the maximum 
number of overlapping intervals. In our interval problem we also provide a point 
contained in the intersection of these intervals. There is no clear indication of 
how this would be accomplished using Lee’s structure. It can be achieved by 
reporting all of the intervals contained in the maximum clique and calculating 
their intersection. However, the algorithm that Lee sketches for reporting the 
maximum clique requires $O(n \log n)$ space and each query requires 
$\Omega(k + \log n)$ time where k is the size of the maximum clique. 
Abstract. We introduce a new measure for planar point sets S. Intuitively, 
it describes the combinatorial distance from a convex set: The 
reflexivity $\rho(S)$ of S is given by the smallest number of reflex vertices in 
a simple polygonalization of S. We prove various combinatorial bounds 
and provide efficient algorithms to compute reflexivity, both exactly (in 
special cases) and approximately (in general). Our study naturally takes 
us into the examination of some closely related quantities, such as the 
convex cover number $\kappa_1(S)$ of a planar point set, which is the smallest 
number of convex chains that cover S, and the convex partition number 
$\kappa_2(S)$, which is given by the smallest number of disjoint convex 
chains that cover S. We prove that it is NP-complete to determine the 
convex cover or the convex partition number, and we give logarithmic-approximation 
algorithms for determining each. 

1 Introduction 

In this paper, we study a fundamental combinatorial property of a discrete set, 
S, of points in the plane: What is the minimum number, $\rho(S)$, of reflex vertices 
among all of the simple polygonalizations of S? A polygonalization of S is a closed 
tour on S whose straight-line embedding in the plane defines a connected cycle 
without crossings, i.e., a simple polygon. A vertex of a simple polygon is reflex 
if it has interior angle greater than $\pi$. We refer to $\rho(S)$ as the reflexivity of S. 

In general, there are many different polygonalizations of a point set, S. There 
is always at least one: simply connect the points in angular order about the 
center of mass. A set S has precisely one polygonalization, if and only if it is 
in convex position, but usually there is a great number of them. Studying the 
set of polygonalizations (e.g., counting them, enumerating them, or generating 
a random element) is a challenging active area of investigation in computational 
geometry. 

The reflexivity $\rho(S)$ quantifies, in a combinatorial sense, the degree to which 
the set of points S is in convex position. See Figure 1 for an example. 
Fig. 1. Two polygonalizations of a point set, one (left) using 7 reflex vertices and one 
(right) using only 3 reflex vertices. 



We conduct a formal study of reflexivity, both in terms of its combinatorial 
properties and in terms of an algorithmic analysis of the complexity of computing 
it, exactly or approximately. Some of our attention is focussed on the closely 
related convex cover number of S, which gives the minimum number of convex 
chains (subsets of S in convex position) that are required to cover all points of S. 
For this question, we distinguish two cases: The convex cover number, $\kappa_1(S)$, is 
the smallest number of convex chains to cover S; the convex partition number, 
$\kappa_2(S)$, is the smallest number of convex chains with pairwise-disjoint convex 
hulls to cover S. (Note that nested chains are feasible for a convex cover, but 
not for a convex partition.) 

Motivation. In addition to the fundamental nature of the questions and problems 
we address, we are also motivated to study reflexivity for several other reasons: 

(1) An application motivating our original investigation is that of meshes 
of low stabbing number and their use in performing ray shooting efficiently. 
If a point set S has low reflexivity or convex partition number, then it has a 
triangulation of low stabbing number, which is much lower than the general 
$O(\sqrt{n})$ upper bound guaranteed to exist ( 111121 ). (For example, constant 
reflexivity implies a triangulation of logarithmic stabbing number.) 

(2) Classifying point sets by their reflexivity may give us some structure for 
dealing with the famously difficult question of counting and exploring the set of 
all polygonalizations of S. See CH for some references to this problem. 

(3) There are several applications in computational geometry in which the 
number of reflex vertices of a polygon can play an important role in the com- 
plexity of algorithms. If one or more polygons are given to us, there are many 
problems for which more efficient algorithms can be written with complexity in 
terms of “r” (the number of reflex vertices), instead of “n” (the total number 
of vertices), taking advantage of the possibility that we may have $r \ll n$ for 
some practical instances. (See, e.g., CM-) The number of reflex vertices also 
plays an important role in convex decomposition problems for polygons (see, 
e.g., [17]). 

(4) Reflexivity is intimately related to the issue of convex cover numbers, 
which has roots in the classical work of Erdős and Szekeres [8,9], and has been 
studied more recently by Urabe. 
(5) Our problems are related to some problems in curve (surface) reconstruction, 
where the goal is to obtain a “good” polygonalization of a set of sample 
points. (See |3EE|-) 

Related Work. The study of convex chains in finite planar point sets is the topic 
of classical papers by Erdős and Szekeres [8,9], who showed that any point set 
of size n has a convex subset of size $t = \Omega(\log n)$. This is closely related to 
the convex cover number $\kappa_1$, since it implies an asymptotically tight bound on 
$\kappa_1(n)$, the worst-case value for sets of size n. There are still a number of open 
problems related to the exact relationship between t and n; see, for example, m 
for recent developments. Other issues have been considered, such as the existence 
and computation (see 0) of large “empty” convex subsets (i.e., with no points 
of S interior to their hull); this is related to the convex partition number, $\kappa_2(S)$. 
It was shown by Horton m that there are sets with no empty convex chain 
larger than 6, so this implies that $\kappa_2(n) \ge n/6$. Tighter worst-case bounds were 
given by Urabe E2ESI. 

Another possibility is to consider a simple polygon having a given set of 
vertices, that is “as convex as possible”. This has been studied in the context 
of TSP tours of a point set S, where convexity of S provides a trivial optimal 
tour. Convexity of a tour can be characterized by two conditions. If we drop the 
global condition (i.e., no crossing edges), but keep the local condition (i.e., no 
reflex vertices), we get “pseudoconvex” tours. In 0]j it was shown that any set 
with $|S| \ge 5$ has such a pseudoconvex tour. It is natural to require the global 
condition of simplicity instead, and minimize the number of local violations - i.e., 
the number of reflex vertices. As in the paper m, this draws a close connection 
to angles in a tour, a problem that has also been studied by Aggarwal et al. [2]. 
We will see in Section 0 that there are further connections. 

The number of polygonalizations on n points is in general exponential in n. 
To give tight bounds on the maximum value attainable for a given n has also 
been object of intensive research (mi)- The minimum number of reflex vertices 
among all the polygonalizations of a point set S is the reflexivity of S, a concept 
we introduce in this work. 

Main Results. The main results of this work include: 

— Tight bounds on the worst-case reflexivity in a number of cases, including 
the general case and the case of onion depth 2. 

— Upper and lower bounds on reflexivity, convex cover number, convex parti- 
tion number, and their relative behavior. We obtain exact worst-case values 
for small cardinalities. 

— Proofs of NP-completeness for computing convex cover and convex partition 
numbers. 

— Algorithmic results yielding O(logn) approximations for convex cover num- 
ber, convex partitioning number, and (Steiner) reflexivity. We also give effi- 
cient exact algorithms for cases of low reflexivity. 

Throughout this extended abstract we omit many proofs and details, due to 
space limitations. We refer the reader to the full paper, available on the internet. 
2 Preliminaries 

Throughout this paper, S will be a set of n points in the plane. Let P be any 
polygonalization of S. We say that P is simple if edges may only share common 
endpoints, and each endpoint is incident to exactly two edges. Let $\mathcal{P}$ be 
the set of all polygonalizations of S. Note that $\mathcal{P}$ is not empty, since 
any point set S having $n \ge 3$ points has at least one polygonalization (e.g., the 
star-shaped polygonalization obtained by sorting points of S angularly about a 
point interior to the convex hull of S). 

A simple polygon P is a closed Jordan curve, subdividing the plane into an 
unbounded and a bounded component. We say that the bounded component 
is the interior of P. A reflex vertex of a simple polygon is a common endpoint 
of two edges, such that the interior angle between these edges is larger than 
$\pi$. We say that an angle is convex if it is not reflex. We define $r(P)$ to be the 
number of reflex vertices in P, and $c(P)$ to be the number of non-reflex, i.e., 
convex vertices in P. We define the reflexivity of a planar point set S to be 
$\rho(S) = \min_{P \in \mathcal{P}} r(P)$. Similarly, the convexivity of a planar point 
set S is defined to be $\chi(S) = \max_{P \in \mathcal{P}} c(P)$. Note that 
$\chi(S) = n - \rho(S)$. 
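
To make these definitions concrete, the following Python sketch builds one particular polygonalization, the star-shaped one obtained by angular sorting, and counts its reflex vertices. The function names are hypothetical, and the sketch assumes that the points are not all collinear, that no point coincides with the centroid, and that no two points lie on a common ray from the centroid. Note that the count is $r(P)$ for this single polygonalization P, so it is only an upper bound on $\rho(S)$, not $\rho(S)$ itself.

```python
import math

def star_polygonalization(points):
    """Order the points by angle about their centroid (counter-clockwise)."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def count_reflex(polygon):
    """Number r(P) of reflex vertices of a simple polygon in counter-clockwise order."""
    n = len(polygon)
    reflex = 0
    for i in range(n):
        ax, ay = polygon[i - 1]
        bx, by = polygon[i]
        cx, cy = polygon[(i + 1) % n]
        # Cross product of the incoming and outgoing edge at vertex b.
        turn = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if turn < 0:  # right turn on a counter-clockwise boundary: interior angle > pi
            reflex += 1
    return reflex
```

For a polygon P on n vertices, the number of convex vertices is then `len(P) - count_reflex(P)`, matching $c(P) = n - r(P)$.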

The convex hull $CH(S)$ of a set S is the smallest convex set that contains 
all elements of S; the convex hull elements of S are the members of S that lie 
on the boundary of the convex hull. The layers of a point set S are given by 
repeatedly removing all convex hull elements, and considering the convex hull of 
the remaining set. We say that S has k layers or onion depth k if this process 
terminates after precisely k layers. A set S forms a convex chain (or is in convex 
position) if it has only one layer. A Steiner point is a point not in the set S 
that may be added to S in order to improve some structure of S. We define 
the Steiner reflexivity $\rho'(S)$ to be the minimum number of reflex vertices of 
any simple polygon with vertex set $V \supseteq S$. Similarly, we can define the 
Steiner convexivity $\chi'(S)$. (Furthermore, Steiner points can be required to lie 
within the convex hull, or be arbitrary. In this abstract, we do not elaborate on 
this difference.) 
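
The layer decomposition can be computed by repeatedly peeling off the convex hull. The Python sketch below does this with Andrew's monotone chain hull algorithm, a standard technique chosen here for brevity. The function names are hypothetical, and the sketch assumes general position (no three points collinear), since its hull routine discards points lying in the interior of a hull edge, whereas the definition above counts them as convex hull elements.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def onion_layers(points):
    """List of layers, outermost first; its length is the onion depth."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers
```

A set is in convex position exactly when `onion_layers` returns a single layer. Peeling this way recomputes a hull per layer, so the sketch favours clarity over running time.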

We use the notation $\rho(n) = \max_{S: |S| = n} \rho(S)$ and 
$\chi(n) = \min_{S: |S| = n} \chi(S)$ for the worst-case values for point sets of 
size n. For a given finite set S, let $\mathcal{C}_1$ be the family of all sets of 
convex chains, such that each element of S is part of at least one chain. We say 
that a set of chains $C \in \mathcal{C}_1$ is called a convex cover of S. Similarly, 
$\mathcal{C}_2 \subseteq \mathcal{C}_1$ is the family of all convex covers of S for 
which the convex hulls of any two chains are mutually disjoint. Then we define the 
convex cover number $\kappa_1(S) = \min_{C \in \mathcal{C}_1} |C|$ as the smallest 
size of a convex cover of S, and the convex partition number 
$\kappa_2(S) = \min_{C \in \mathcal{C}_2} |C|$. Again, we denote by $\kappa_1(n)$ and 
$\kappa_2(n)$ the worst-case values for sets of size n. 

Finally, we state a basic property of polygonalizations of point sets. The proof 
is straightforward. 

Lemma 1. In any polygonalization of S, the points on the convex hull of S are 
always convex vertices, and they occur in the polygonalization in the same order 
in which they occur along the convex hull. 
3 Combinatorial Bounds 

In this section we establish several combinatorial results on reflexivity and convex 
cover numbers. 

One of our main combinatorial results establishes an upper bound on the 
reflexivity of S that is tight in terms of the number $n_I$ of points interior to the 
convex hull of S. Given that the points of S that are vertices of the convex hull 
are required to be convex vertices in any (non-Steiner) polygonalization of S, 
the bound in terms of $n_I$ seems to be quite natural. 

Theorem 1. Let S be a set of n points in the plane, $n_I$ of which are interior to 
the convex hull $CH(S)$. Then, $\rho(S) \le \lceil n_I/2 \rceil$. 




Fig. 2. Computing a polygonalization with at most $\lceil n_I/2 \rceil$ reflex vertices. 



Proof. We describe a polygonalization in which at most half of the interior points 
are reflex. We begin with the polygonalization of the convex hull vertices that is 
given by the convex polygon bounding the hull. We then iteratively incorporate 
other (interior) points of S into the polygonalization. Fix a point $p_0$ that lies on 
the convex hull of S. At a generic step of the algorithm, the following invariants 
hold: (1) our polygonalization consists of a simple polygon, P, whose vertices 
form a subset of S, and (2) all points $S' \subset S$ that are not vertices of P 
lie interior to P; in fact, the points $S'$ all lie within the subpolygon, Q, to the 
left of the diagonal $p_0 p_i$, where $p_i$ is a vertex of P such that the subchain of 
$\partial P$ from $p_i$ to $p_0$ (counter-clockwise) together with the diagonal 
$p_0 p_i$ forms a convex polygon (Q). Define $p_{i+1}$ to be the first point of $S'$ 
that is encountered when sweeping the ray $p_0 p_i$ counter-clockwise about its 
endpoint $p_0$. Then, we sweep the subray with endpoint $p_{i+1}$ further 
counter-clockwise, about $p_{i+1}$, until we encounter another point, q, of $S'$. 
(If $|S'| = 1$, we can readily incorporate $p_{i+1}$ into the polygonalization, 
increasing the number of reflex vertices by one.) Now, the ray $p_{i+1} q$ 
intersects the boundary of P at some point $c \in ab$ on the boundary of Q. 

We now modify P to include interior points $p_{i+1}$ and q (and possibly others 
as well) by replacing the edge ab with the chain 
$(a, p_{i+1}, q, q_1, \ldots, q_k, b)$, where the points $q_i$ are interior points that 
occur along the chain we obtain by “pulling taut” the chain (q, c, b) (i.e., by 
continuing, in the “gift wrapping” fashion, to rotate rays counter-clockwise about 
each interior point that is hit until we encounter b). In this way we incorporate 
at least two new interior points (of $S'$) into the polygonalization P, while 
creating only one new reflex vertex (at $p_{i+1}$). It is easy to check that the 
invariants (1) and (2) hold after this step. □ 

In fact, the upper bound of Theorem 1, $\rho(S) \le \lceil n_I/2 \rceil$, is tight, as 
we now argue based on the special configuration of points, $S = S_0(n)$, in 
Figure 3. The set $S_0(n)$ is defined for any integer $n \ge 6$, as follows: 
$\lceil n/2 \rceil$ points are placed in convex position (e.g., forming a regular 
$\lceil n/2 \rceil$-gon), forming the convex hull $CH(S)$, and the remaining 
$n_I = \lfloor n/2 \rfloor$ interior points are also placed in convex position, each 
one placed “just inside” $CH(S)$, near the midpoint of an edge of $CH(S)$. The 
resulting configuration $S_0(n)$ has two layers in its convex hull. Lemma 2 below 
shows that $\rho(S_0(n)) \ge \lceil n_I/2 \rceil \ge \lfloor n/4 \rfloor$. 




Fig. 3. Left: The configuration of points, $S_0(n)$, which has reflexivity 
$\rho(S_0(n)) \ge \lceil n_I/2 \rceil$. Right: A polygonalization having 
$\lceil n_I/2 \rceil$ reflex vertices. 



Lemma 2. For any $n \ge 6$, $\rho(S_0(n)) \ge \lceil n_I/2 \rceil \ge \lfloor n/4 \rfloor$. 

Proof. Denote by $x_i$ the points on the convex hull, $i = 1, \ldots, \lceil n/2 \rceil$, 
and by $v_i$ the points “just inside” the convex hull, 
$i = 1, \ldots, \lfloor n/2 \rfloor$, where $v_i$ is along the convex hull edge 
$(x_i, x_{i+1})$. 

From Lemma 1 we know that the points $x_i$ are connected in their order around 
the convex hull, and are all convex vertices in any polygonalization. Consider an 
arbitrary pocket of a polygonalization, having lid $(x_j, x_{j+1})$, and let $m_j$ 
denote the number of interior points that go to this pocket (if the convex hull edge 
$(x_j, x_{j+1})$ belongs to the polygonalization, with a slight abuse of notation we 
can consider it as a pocket with $m_j = 0$). Observe that $m_j > 0$ implies that 
$v_j$ belongs to this particular pocket. If $m_j = 1$ the pocket contains a single 
interior point, namely $v_j$, and then $v_j$ is a reflex vertex in this 
polygonalization. To complete our proof we will show that if this pocket contains 
more interior points, among them only $v_j$ will be a convex point. 

We use the following simple fact: Given a set of points, all but one of which is 
in convex position, all polygonalizations of this set have a unique reflex vertex, 
namely the point not on the convex hull. 
The pocket with lid $(x_j, x_{j+1})$ includes the points $x_j$ and $x_{j+1}$; if 
$v_j$ is not the only interior point included in this pocket, then this pocket 
together with the lid is a polygon as in the simple fact. Therefore the polygon 
which is the pocket has only one reflex vertex, $v_j$, and when considered 
“inside-out” as a pocket of the original polygon, only $v_j$ among the interior 
points is a convex vertex. 

Therefore the number of reflex vertices in a pocket is in any case at least 
$\lceil m_j/2 \rceil$, and we have 

$$\rho(S_0(n)) \ge \sum_j \lceil m_j/2 \rceil \ge \lceil n_I/2 \rceil \ge \lfloor n/4 \rfloor.$$

□ 



Remark. Since $n_I \le n$, the corollary below is immediate from the theorem. The 
gap in the bounds for $\rho(n)$, between $\lfloor n/4 \rfloor$ and $\lceil n/2 \rceil$, 
remains an intriguing open problem. While our combinatorial bounds are tight in 
terms of $n_I$ (the number of points of S whose convexivity/reflexivity is not 
forced by the convex hull of S), they are not yet tight in terms of n. 

Corollary 1. $\lfloor n/4 \rfloor \le \rho(n) \le \lceil n/2 \rceil$. 



Steiner Points. If we allow Steiner points in the polygonalizations of S, the 
reflexivity of S may decrease; in fact, we have examples of point sets with 
reflexivity $\rho(S) = r$, where the introduction of Steiner points reduces the 
reflexivity by a factor of 2 compared with the no-Steiner case: $\rho'(S) = r/2$. At 
this point, it is unclear whether this estimate characterizes a worst case. We 
believe it does: 

Conjecture 1. For a set S of points in the plane, we have $\rho'(S) \ge \rho(S)/2$. 

In terms of the cardinality n of S, we obtain the following combinatorial 
bounds; a proof can be found in the full version of the paper: 

Theorem 2. For a set S of n points in the plane, we have 
$\rho'(S) \le \lceil n/3 \rceil$. 

By a careful analysis of our example in Figure 3 we can show the following: 

Theorem 3. If one only allows Steiner points that are interior to $CH(S)$, then 
any Steiner polygonalization of $S_0(n)$ has at least $\lceil n/4 \rceil$ reflex vertices. 



Two-Layer Point Sets. Let S be a point set that has onion depth 2. It is clear 
from our repeated use of the example in Figure 3 that this is a natural case that 
is a likely candidate for worst-case behavior. With a very careful analysis of this 
case, we are able to obtain tight bounds on the worst-case reflexivity in terms 
of n: 

Theorem 4. Let S be a set of n points having two layers. Then 
$\rho(S) \le \lceil n/4 \rceil$, and this bound is tight. 

A proof can be found in the full version of our paper. 

Furthermore, we can prove the following lower bound for polygonalizations 
with a very special structure that may be useful in an inductive proof in more 
general cases: 

Lemma 3. For a point set with onion depth 2 a polygonalization with at most 
1"^] reflex vertices exists such that none of the edges exist in the interior of the 
inner layer of the onion. 

Convex Cover Numbers. As we noted in the introduction, it was shown by Erdős
and Szekeres [8,9] that any set of $n$ points in the plane has a convex chain of size
$\Omega(\log n)$. Moreover, they have shown that there are sets of size $2^t + 1$ without a
convex chain of $t + 3$ points. This implies the following:

Theorem 5. $\kappa_1(n) = \Theta(n/\log n)$.

Proof. For point sets with $\kappa_1(n) = \Omega(n/\log n)$, consider the sets constructed by
Erdős and Szekeres. These have a largest chain of size $O(\log n)$, and the lower
bound follows.

To see that there always is a cover with $O(n/\log n)$ chains, consider a greedy
cover in the following way. Let $S_0 = S$, and for each $S_i$, remove a largest chain,
yielding $S_{i+1}$. By the result of Erdős and Szekeres, each removed chain has size
$\Omega(\log |S_i|)$; the lower bound for the size of $S_i$ remains constant until $\lfloor \log |S_i| \rfloor$
decreases, i.e., until we have removed at least half of the points. Furthermore, any
largest convex chain in Si has at least 3 points, so the iteration must terminate 
after removing a series of chains of size at least 3. This yields a total number 
of at most chains. A straightforward 

induction over q shows that < 2 — , so the claim follows. □ 

Even better bounds are known for the disjoint cover number: 

Theorem 6 (Urabe [22,23]).

$\lceil (n-1)/4 \rceil \le \kappa_2(n) \le \lceil 2n/7 \rceil$.

The lower bound can be seen directly from our figure; the upper bound is
the result of a detailed construction in [22].

Now we discuss the relationship between the different measures for a set $S$.
The ratio between $\kappa_1(S)$ and $\kappa_2(S)$ for a set $S$ may be as big as $\Theta(n)$: our example
in Figure 0 has $\kappa_1(S) = 2$, but $\kappa_2(S) \ge n/4$. However, there is a very tight upper
bound on $\kappa_2(S)$ in terms of $\rho(S)$:

Theorem 7. For a planar set $S$, we have the estimates $\kappa_1(S) \le \kappa_2(S) \le \rho(S) + 1$,
and these upper bounds are best possible.

Proof. The upper bound for ki{S) by K 2 {S) is trivial. The upper bound for 
K 2 {S) by p{S) can be found in @). □ 

One can construct examples with $2\kappa_2(S) = \rho(S)$, which is the worst example
we know of; see the full paper. On the other hand, it is a surprisingly difficult
open problem to prove that there is some bounded ratio between $\kappa_2(S)$ and
$\rho(S)$:

Conjecture 2. For a set $S$ of points in the plane, we have $\rho(S) = O(\kappa_2(S))$.

However, it is not hard to see that the estimate $\rho'(S) = O(\kappa_2(S))$ holds
(see the proof of Corollary 2), so a proof of Conjecture 2 would follow from the
validity of Conjecture 1.

Small Point Sets. It is natural to consider the exact values of $\rho(n)$, $\kappa_1(n)$, and
$\kappa_2(n)$ for small values of $n$. Table 1 below shows some of these values, which
we obtained through (sometimes tedious) case analysis. Oswin Aichholzer has
recently applied his software that enumerates point sets of size $n$ of all distinct
order types to verify our results; in addition, he has obtained the result that
$\rho(10) = 3$. (Values of $n \ge 11$ seem to be intractable for enumeration.)



   n      $\rho(n)$   $\kappa_1(n)$   $\kappa_2(n)$
  ≤ 3        0           1               1
   4         1           2               2
   5         1           2               2
   6         2           2               2
   7         2           2               2
   8         2           2               3
   9         3           3               3

Table 1. Worst-case values of $\rho$, $\kappa_1$, $\kappa_2$ for small values of $n$.



4 Complexity 

Theorem 8. It is NP-complete to decide whether for a planar point set $S$ the
convex partition number $\kappa_2(S)$ is below some threshold $k$.

Proof. We give a reduction from Planar 3 Sat, which was shown to be NP-complete
by Lichtenstein (see [18]). See Figure 4 for the overall proof idea, and
the full paper for proof details. □

Theorem 9. It is NP-complete to decide whether for a planar point set $S$ the
convex cover number $\kappa_1(S)$ is below some threshold $k$.

Proof. Our proof uses a reduction from the problem 1-in-3 SAT. (It is inspired by
the hardness proof for the Angular Metric TSP given in [2].) See Figure 5
for the proof idea, with bold lines corresponding to appropriate dense point sets.
Proof details can be found in the full paper. □

So far, the complexity status of determining the reflexivity of a point set 
remains open. However, the close relationship between convex cover number 
and reflexivity leads us to believe the following: 

Conjecture 3. It is NP-complete to determine the reflexivity $\rho(S)$ of a point set.

Fig. 4. (a) A straight-line embedding of the occurrence graph for the 3 Sat instance
{xi V 5?2 V 0:3) A {x2 V X3 V a^) A (alT V *2 V afj); (b) a polygon for a variable vertex; (c) 
a point set $S_I$ representing the Planar 3 Sat instance $I$; (d) joining point sets along
the odd or even polygon edges. 




Fig. 5. A point set $S_I$ for a 1-in-3 SAT instance $I$. Pivot points are shown for the
clause (xi V Xn)- 



5 Algorithms 

In this section, we provide a number of algorithmic results. Since some of the 
methods are rather technical, we can only give proof sketches in this extended 
abstract. 

Theorem 10. Given a set $S$ of $n$ points in the plane, in $O(n \log n)$ time one can
compute a polygonalization of $S$ having at least $\chi(S)/2$ convex vertices, where
$\chi(S)$ is the convexivity of $S$.

Proof, (sketch) The algorithm of Theorem Q is constructive, producing a polyg- 
onalization of S having at most n//2 < n/2 reflex vertices, and thus at least 
n/2 convex vertices (thereby giving a 2-approximation for convexivity). In order 
to obtain the stated time bound, we must implement the algorithm efficiently. 
We utilize a dynamic convex hull data structure (iia)> to be able to obtain the 
points $q, q_1, \ldots, q_k$ efficiently (in amortized $O(\log n)$ time per point). □
Theorem 11. Given a set $S$ of $n$ points in the plane, the convex cover number,
$\kappa_1(S)$, can be computed approximately, within a factor of $O(\log n)$, in polynomial
time.

Proof, (sketch) We use a greedy set cover heuristic. At each stage, we need to 
compute a largest convex subset among the remaining (uncovered) points of S. 
This can be done in polynomial time using the dynamic programming methods 

of HO]. □ 
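The greedy heuristic in this proof sketch can be illustrated on tiny inputs. The following is only a sketch under stated assumptions: the polynomial-time dynamic program for a largest convex subset is replaced here by a brute-force search over subsets (exponential time, fine for a handful of points), and the names `in_convex_position`, `largest_convex_subset` and `greedy_convex_cover` are hypothetical, not taken from the paper.

```python
from itertools import combinations

def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_size(points):
    """Number of convex hull vertices (monotone chain, collinear points dropped)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return len(pts)
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain
    return len(half(pts)) + len(half(reversed(pts))) - 2

def in_convex_position(points):
    """True if every point is a vertex of the convex hull of the set."""
    return hull_size(points) == len(set(points))

def largest_convex_subset(points):
    """Brute-force stand-in (assumption) for the dynamic programming step."""
    for size in range(len(points), 0, -1):
        for subset in combinations(points, size):
            if in_convex_position(subset):
                return list(subset)
    return []

def greedy_convex_cover(points):
    """Repeatedly remove a largest convex subset of the uncovered points."""
    uncovered, chains = list(points), []
    while uncovered:
        chain = largest_convex_subset(uncovered)
        chains.append(chain)
        uncovered = [p for p in uncovered if p not in chain]
    return chains
```

For a square with one interior point, the greedy cover takes the four corners first and then the remaining point as a chain of its own.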

Theorem 12. Given a set $S$ of $n$ points in the plane, the convex partition number,
$\kappa_2(S)$, can be computed approximately, within a factor of $O(\log n)$, in polynomial
time.

Proof (sketch). Let $C^* = \{P_1, \ldots, P_{k^*}\}$ denote an optimal solution, consisting
of $k^* = \kappa_2(S)$ disjoint convex polygons whose vertices are the set $S$. Following
the method of pni, we partition each of these polygons into O(logn) vertical 
trapezoids whose $x$-projection is a “canonical interval” (one of the $O(n)$ such
intervals determined by the segment tree on $S$).

For the algorithm, we use dynamic programming to compute a minimum-cardinality
partition of $S$ into a disjoint set, $C'$, of (empty) convex subsets whose
$x$-projections are canonical intervals. Since the optimal solution, $C^*$, can be
converted into at most $k^* \cdot O(\log n)$ such convex sets, we know we have obtained
an $O(\log n)$-approximate solution to the disjoint convex partition problem. □
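To make the notion of canonical intervals concrete, the sketch below decomposes a range of consecutive ranks (positions of the points in sorted $x$-order) into the canonical intervals of a segment tree. It is a generic illustration, not the authors' procedure: the function name `canonical_intervals` and the half-open index convention are assumptions made here.

```python
def canonical_intervals(lo, hi, node_lo, node_hi):
    """Split the half-open rank range [lo, hi) into canonical intervals.

    [node_lo, node_hi) is the range covered by the current segment-tree node;
    call with the root range, e.g. (0, n). A range is covered by O(log n)
    canonical intervals, and the tree has O(n) nodes in total.
    """
    if hi <= node_lo or node_hi <= lo:
        return []                      # node is disjoint from the query range
    if lo <= node_lo and node_hi <= hi:
        return [(node_lo, node_hi)]    # node lies fully inside: canonical piece
    mid = (node_lo + node_hi) // 2
    return (canonical_intervals(lo, hi, node_lo, mid)
            + canonical_intervals(lo, hi, mid, node_hi))
```

With eight points, the ranks 1 through 6 split into the four pieces (1,2), (2,4), (4,6), (6,7), which is how a single convex polygon spanning those ranks would be cut into vertical trapezoids.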

Corollary 2. Given a set $S$ of $n$ points in the plane, its Steiner reflexivity,
$\rho'(S)$, can be computed approximately, within a factor of $O(\log n)$, in polynomial
time.

A proof can be found in the full paper. 

Special Cases. For small values of $r$, we have devised particularly efficient
algorithms that check if $\rho(S) \le r$ and, if so, produce a witness polygonalization
having at most $r$ reflex vertices. Of course, the case $r = 0$ is trivial, since that is
equivalent to testing if $S$ lies in convex position (which is readily done in $O(n \log n)$
time, which is worst-case optimal). It is not particularly surprising that for any
fixed $r$ one can obtain an $n^{O(r)}$-time algorithm, e.g., by enumerating over all $r$-element
subsets that correspond to reflex vertices, along with all possible neighboring
segments incident on these vertices, etc. The factor in front of $r$ in the exponent,
however, is not so trivial to reduce. In particular, the straightforward method 
applied to the case r = 1 gives O(n^) time. With a more careful analysis of the 
cases r = 1, 2, we obtain: 

Theorem 13. Given a set $S$ of $n$ points in the plane, in $O(n \log n)$ time one
can determine if $\rho(S) = 1$, and, if so, produce a witness polygonalization.
Furthermore, $\Omega(n \log n)$ is a lower bound.

For r = 2, a careful analysis of how two pockets can interact also yields a 
very efficient algorithm; in the full paper we prove: 

Theorem 14. Given a set S of n points in the plane, in 0{n^) time one can 
determine if $\rho(S) = 2$, and, if so, produce a witness polygonalization.



On the Reflexivity of Point Sets 203 



Acknowledgments. We thank Adrian Dumitrescu for valuable input on this 
work. The collaboration between UPC and SUNY Stony Brook was made pos- 
sible by a grant from the Joint Commission USA-Spain for Scientific and Tech- 
nological Cooperation Project 98191. E. Arkin acknowledges support from the 
NSF (CCR-9732221) and HRL Laboratories. S. Fekete acknowledges travel sup- 
port from the Hermann-Minkowski-Minerva Center for Geometry at Tel Aviv 
University. F. Hurtado, M. Noy, and V. Sacristan acknowledge support from 
CUR Gen. Cat. 1999SGR00356, and Proyecto DGES-MEC PB98-0933. J. Mit- 
chell acknowledges support from HRL Laboratories, NSF (CCR-9732221), NASA 
(NAG2-1325), Northrop-Grumman, Sandia, Seagull Technology, and Sun Mi- 
crosystems. 

References 

1. P. K. Agarwal. Ray shooting and other applications of spanning trees with low
stabbing number. SIAM J. Comput., 21, 540-570, 1992.

2. A. Aggarwal, D. Coppersmith, S. Khanna, R. Motwani, and B. Schieber. The
angular-metric traveling salesman problem. In Proceedings of the Eighth Annual
ACM-SIAM Symposium on Discrete Algorithms, pages 221-229, Jan. 1997.

3. N. Amenta, M. Bern, and D. Eppstein. The crust and the β-skeleton: Combinatorial
curve reconstruction. Graphical Models and Image Processing, 60, 125-135, 1998.

4. B. Chazelle. Computational geometry and convexity. Ph.D. thesis, Dept. Comput.
Sci., Yale Univ., New Haven, CT, 1979. Carnegie-Mellon Univ. Report CS-80-150.

5. T. K. Dey and P. Kumar. A simple provable algorithm for curve reconstruction.
In Proc. 10th ACM-SIAM Sympos. Discrete Algorithms, pages 893-894, Jan. 1999.

6. T. K. Dey, K. Mehlhorn, and E. A. Ramos. Curve reconstruction: Connecting
dots with good reason. In Proc. 15th Annu. ACM Sympos. Comput. Geom., pages
197-206, 1999.

7. D. P. Dobkin, H. Edelsbrunner, and M. H. Overmars. Searching for empty convex
polygons. Algorithmica, 5, 561-571, 1990.

8. P. Erdős and G. Szekeres. A combinatorial problem in geometry. Compositio
Math., 2, 463-470, 1935.

9. P. Erdős and G. Szekeres. On some extremum problem in geometry. Ann. Univ.
Sci. Budapest, 3-4, 53-62, 1960.

10. S. P. Fekete and G. J. Woeginger. Angle-restricted tours in the plane. Comp.
Geom. Theory Appl., 8, 195-218, 1997.

11. A. Garcia, M. Noy, and J. Tejel. Lower bounds for the number of crossing-free
subgraphs of $K_n$. In Proc. 7th Canad. Conf. Comput. Geom., pages 97-102, 1995.

12. J. Hershberger and S. Suri. A pedestrian approach to ray shooting: Shoot a ray,
take a walk. J. Algorithms, 18, 403-431, 1995.

13. S. Hertel and K. Mehlhorn. Fast triangulation of the plane with respect to simple
polygons. Inf. Control, 64, 52-76, 1985.

14. J. Hershberger and S. Suri. Applications of a semi-dynamic convex hull algorithm.
BIT, 32, 249-267, 1992.

15. J. Horton. Sets with no empty convex 7-gons. Canad. Math. Bull., 26, 482-484,
1983.

16. F. Hurtado and M. Noy. Triangulations, visibility graph and reflex vertices of a
simple polygon. Comput. Geom. Theory Appl., 6, 355-369, 1996.







E.M. Arkin et al. 



17. J. M. Keil. Polygon decomposition. In J.-R. Sack and J. Urrutia, editors,
Handbook of Computational Geometry, pages 491-518. Elsevier Science Publishers B.V.
North-Holland, Amsterdam, 2000.

18. D. Lichtenstein. Planar formulae and their uses. SIAM J. Comput., 11, 329-343,
1982.

19. J. S. B. Mitchell. Approximation algorithms for geometric separation problems.
Technical report, Department of Applied Mathematics, SUNY Stony Brook, NY,
July 1993.

20. J. S. B. Mitchell, G. Rote, G. Sundaram, and G. Woeginger. Counting convex
polygons in planar point sets. Inform. Process. Lett., 56, 191-194, 1995.

21. J. Pach (ed.). Discrete and Computational Geometry, 19, Special issue dedicated
to Paul Erdős, 1998.

22. M. Urabe. On a partition into convex polygons. Discrete Appl. Math., 64, 179-191,
1996.

23. M. Urabe. On a partition of point sets into convex polygons. In Proc. 9th Canad.
Conf. Comp. Geom., pages 21-24, 1997.




A 7/8-Approximation Algorithm
for Metric Max TSP



Refael Hassin and Shlomi Rubinstein



Department of Statistics and Operations Research,
School of Mathematical Sciences,
Tel-Aviv University, Tel-Aviv 69978, Israel,
{hassin,shlomiru}@post.tau.ac.il



Abstract. We present a randomized approximation algorithm for the
metric undirected maximum traveling salesman problem. Its expected
performance guarantee approaches 7/8 as $n \to \infty$, where $n$ is the
number of vertices in the graph.

1 Introduction 

Let $G = (V, E)$ be a complete (undirected) graph with vertex set $V$, $|V| = n$,
and edge set $E$. Since we only deal with asymptotic bounds, we assume without
loss of generality that $n$ is even. For $e \in E$ let $w(e) \ge 0$ be its weight. For $E' \subseteq E$
we denote $w(E') = \sum_{e \in E'} w(e)$. For a random subset $E' \subseteq E$, $w(E')$ denotes
the expected value. The maximum traveling salesman problem (Max TSP)
is to compute a Hamiltonian circuit (a tour) with maximum total edge weight.
If the weights $w(e)$ satisfy the triangle inequality, we call the problem the metric
maximum traveling salesman problem.

We denote the weight of an optimal tour by $opt$. In [1] a randomized polynomial
algorithm is given for Max TSP that guarantees for any $r < 25/33$ a solution of
expected weight at least $r \cdot opt$. A paper by Kostochka and Serdyukov contains
an algorithm with a performance guarantee of 5/6 for the metric Max TSP.

We build on ideas from a 3/4-approximation algorithm for Max TSP by
Serdyukov in [3] and a 5/6-approximation algorithm for the metric case by
Kostochka and Serdyukov [2]. We present a randomized approximation algorithm
for the metric problem with expected performance guarantee that approaches 7/8
as $n \to \infty$.

A perfect matching is a subgraph in which each vertex in V has a degree of 
exactly 1. A cycle cover, or binary 2-matching, is a subgraph in which each vertex 
in V has a degree of exactly 2. A maximum cycle cover is one with maximum total
edge weight. The problem of computing a maximum cycle cover is a relaxation 
of Max TSP and therefore the weight of a maximum cycle cover is an upper 
bound on opt. A subtour in this paper is a subgraph with no non-Hamiltonian 
cycles or vertices of degree greater than 2. 
2 Algorithms by Serdyukov and Kostochka

Serdyukov's algorithm for the general Max TSP starts by computing a maximum
cycle cover $C = \{C_1, \ldots, C_s\}$ and a maximum perfect matching $M$. Then it
sequentially, for $i = 1, \ldots, s$, transfers from $C_i$ to $M$ an edge so that $M$ remains a
subtour. Finally it completes $C$ into a tour $T_1$ and $M$ into a tour $T_2$ and returns
the tour with maximum weight between $T_1$ and $T_2$.

The crucial observation is that it is always possible to transfer an edge from $C_i$
to $M$ as required. The performance guarantee follows easily from $w(C) \ge opt$ and
$w(M) \ge \frac{1}{2} opt$. Thus, $w(T_1) + w(T_2) \ge \frac{3}{2} opt$ and therefore
$\max\{w(T_1), w(T_2)\} \ge \frac{3}{4} opt$.

Algorithm Old_Metric given in Figure 1 is a randomized algorithm which
resembles an algorithm by Kostochka and Serdyukov [2] for the metric Max TSP.



Old_Metric

input A complete undirected graph G = (V, E) with weights
      satisfying the triangle inequality.
returns A tour.
begin
    Compute a maximum cycle cover C = {C_1, ..., C_s}.
    Delete from each cycle C_1, ..., C_s a minimum weight edge.
    Let u_i and v_i be the ends of the path P_i that results from C_i.
    Give each path a random orientation and form a tour T by
    adding connecting edges between the head of P_i and the
    tail of P_{i+1} (P_{s+1} = P_1).
    return The tour T.
end Old_Metric



Fig. 1. Old_Metric algorithm
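The steps of Figure 1 after the cycle cover has been computed can be sketched in a few lines. This is an illustration only: computing a maximum cycle cover is not shown and is assumed to be given as a list of vertex lists, and the names `old_metric` and `tour_weight` as well as the weight callback `w(u, v)` are hypothetical choices made for this sketch.

```python
import random

def old_metric(cycles, w, rng=random):
    """Turn a cycle cover into a tour as in Figure 1.

    cycles: list of cycles, each a list of vertices in cyclic order
            (assumed to be a maximum cycle cover computed elsewhere).
    w:      symmetric weight function w(u, v).
    Returns the tour as a list of vertices (closed implicitly).
    """
    tour = []
    for cyc in cycles:
        n = len(cyc)
        # index j of a minimum-weight edge (cyc[j], cyc[j+1]) of this cycle
        j = min(range(n), key=lambda i: w(cyc[i], cyc[(i + 1) % n]))
        # deleting that edge leaves the path cyc[j+1], ..., cyc[j]
        path = cyc[j + 1:] + cyc[:j + 1]
        # random orientation of the path
        if rng.random() < 0.5:
            path.reverse()
        # appending connects the head of the previous path to this tail
        tour.extend(path)
    return tour

def tour_weight(tour, w):
    """Total weight of the closed tour."""
    return sum(w(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing the weight of the resulting tour against the weight of the cycle cover.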



Since $|C_i| \ge 3$:

$w(u_i, v_i) \le \frac{1}{3} w(C_i)$.



By the triangle inequality. 



S 



w{T) = Y^w{Pi) 
3 A New Algorithm

Our algorithm is motivated by the following idea. The value of a random perfect
matching in a graph with weights satisfying the triangle inequality is at least half
the weight of a maximum perfect matching, and consequently at least a quarter
of the weight of the longest tour. The total weight of a maximum cycle cover, a
maximum perfect matching, and a random matching is therefore at least $\frac{7}{4} opt$.
If the corresponding edges could be decomposed into two Hamiltonian cycles
then the longer of these cycles would have weight of at least $\frac{7}{8} opt$. The problem
with this approach is that these edges may not be decomposable into two tours.
The algorithm which we propose below contains a modified approach which
guarantees creation of two tours.

The proposed algorithm is given in Figure 2.



New_Metric

input A complete undirected graph G = (V, E) with weights
      w_e, e ∈ E, satisfying the triangle inequality.
returns A tour.
begin
    Compute a maximum cycle cover C = {C_1, ..., C_s}.
    Compute a maximum perfect matching M.
    for i = 1, ..., s:
        Identify e, f ∈ E ∩ C_i such that both M ∪ {e} and
        M ∪ {f} are subtours.
        Randomly choose g ∈ {e, f} (each with probability 1/2).
        P_i := C_i \ {g}.
        M := M ∪ {g}.
    end for
    Complete P_1, ..., P_s into a tour T_1 as in Algorithm Old_Metric.
    Let S := set of end nodes of paths in M.
    Compute a random perfect matching M_S over S.
    Delete an edge from each cycle in M ∪ M_S.
    Arbitrarily complete M ∪ M_S into a tour T_2.
    return The tour T with maximum weight between T_1 and T_2.
end New_Metric



Fig. 2. New algorithm for metric Max TSP
Theorem 1. The expected weight of the tour returned by Algorithm New_Metric
satisfies $w(T) \ge \left(\frac{7}{8} - O\left(\frac{1}{\sqrt{n}}\right)\right) opt$.

Proof: As before, $w(C) \ge opt$ and $w(M) \ge opt/2$.

The algorithm selects (sequentially, for $i = 1, \ldots, s$) a pair of edges, which we
call candidates, from each cycle of $C$ and deletes one of them, where the selection
of the edge to be deleted is with probability 1/2. Let $\alpha w(C)$ be the weight of
the edges that were candidates for deletion, $0 \le \alpha \le 1$. The expected weight
of the edges that were actually deleted is $(\alpha/2) w(C)$. However, as in Algorithm
Old_Metric, half of this weight is regained when connecting the resulting paths
to a tour $T_1$. Hence,

$$w(T_1) \ge \left(1 - \frac{\alpha}{2} + \frac{\alpha}{4}\right) w(C) \ge \left(1 - \frac{\alpha}{4}\right) opt. \qquad (1)$$



The algorithm adds to $M$ one of the pair of candidates from each cycle of
$C$. The expected weight of the added edges is $(\alpha/2) w(C)$. Note that if a vertex
$v$ is incident to two candidates then certainly $v \notin S$. If $v$ is not incident to a
candidate then certainly $v \in S$. Finally, if $v$ is incident to one candidate then
$v \in S$ with probability 1/2 (which is the probability that this candidate is not
chosen).

Let $|S| = k + 1$. For $i \in S$, exactly one edge from $\{(i,j) \mid j \in S \setminus \{i\}\}$ is chosen
to $M_S$. Thus, for an edge $(i,j) \in E \cap (S \times S)$ the probability that this edge
will be selected to $M_S$ is $1/k$. If $(i,j)$ is selected, charge its weight $w(i,j)$ in the
following manner: Suppose that $i$ is incident to edges $e', e'' \in C$. If none of these
edges was a candidate, charge $w(i,j)/4$ to each of $e'$ and $e''$. If one of them,
say $e'$, was a candidate, charge $w(i,j)/2$ to $e''$ (and nothing to $e'$). Note that
it cannot be that both $e'$ and $e''$ were candidates since in such a case $i \notin S$.
The expected weight charged to an edge $(g,h) \in C$ that was not a candidate
is then $(1/k)\left[\sum_{r \in S \setminus \{g\}} w(r,g)/4 + \sum_{r \in S \setminus \{h\}} w(r,h)/4\right]$. Note that the 1/4 factor
arises also in the case that the vertex, say $g$, is incident with a candidate edge
on $C$, since in such a case $g \in S$ with probability 1/2 and then it gets half of
the weight of the edge of $M_S$ which is incident to it. By the triangle inequality,
$w(r,g) + w(r,h) \ge w(g,h)$ so that the above sum is at least $w(g,h)/4$. We
conclude that $w(M_S) \ge w(C)(1 - \alpha)/4$ and consequently

$$w(M \cup M_S) \ge \left(0.5 + \frac{\alpha}{2} + \frac{1 - \alpha}{4}\right) opt.$$

Finally, the algorithm deletes edges from cycles in $M \cup M_S$. We claim that
$|S| \ge n/3$. The reason is that the perfect matching $M$ computed by the algorithm
had all $n$ vertices of $V$ with degree 1. Then, one candidate from each cycle of $C$
was added to $M$. The number of added edges is equal to the number of cycles
which is at most $n/3$. Therefore, after the addition of these edges the degrees of
at most $2n/3$ vertices became 2, while at least $n/3$ vertices remained with degree
1. The latter vertices are precisely the set $S$, and this proves that $|S| \ge n/3$.
Since $M_S$ is a random matching, the probability that an edge of $M \cup M_S$ is
contained in a cycle whose size is smaller than $\sqrt{n}$ is bounded from above by



1 1 1 ^ y/n / 1 \ 

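(The $j$-th term in the left-hand side of this expression bounds the probability that
a cycle containing exactly $j$ edges from $M_S$ is created.) Therefore, the expected
weight of edges deleted in this step is $O(1/\sqrt{n})\, w(M \cup M_S)$ and

$$w(T_2) \ge \left(\frac{3}{4} + \frac{\alpha}{4}\right)\left(1 - O\left(\frac{1}{\sqrt{n}}\right)\right) opt. \qquad (2)$$

Combining (1) and (2) we get that when $\alpha \le 1/2$, $w(T_1) \ge (7/8)\, opt$ and
when $\alpha > 1/2$, $w(T_2) \ge (7/8)(1 - O(1/\sqrt{n}))\, opt$. Thus,

$$w(T) = \max\{w(T_1), w(T_2)\} \ge \frac{7}{8}\left(1 - O\left(\frac{1}{\sqrt{n}}\right)\right) opt.$$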
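The threshold $\alpha = 1/2$ used in the case distinction is where the two lower bounds meet. As a short worked check (a sketch that ignores the $O(1/\sqrt{n})$ factor and uses only the bounds $(1 - \alpha/4)\,opt$ for $T_1$ and $(3/4 + \alpha/4)\,opt$ for $T_2$ derived in the proof):

```latex
\[
  1 - \frac{\alpha}{4} \;=\; \frac{3}{4} + \frac{\alpha}{4}
  \quad\Longleftrightarrow\quad \alpha = \frac{1}{2},
  \qquad\text{and at } \alpha = \tfrac12 \text{ both sides equal } \tfrac{7}{8}.
\]
\[
  \text{Since } 1 - \tfrac{\alpha}{4} \text{ is decreasing and }
  \tfrac{3}{4} + \tfrac{\alpha}{4} \text{ is increasing in } \alpha,\quad
  \max\Bigl\{\,1 - \tfrac{\alpha}{4},\; \tfrac{3}{4} + \tfrac{\alpha}{4}\Bigr\}
  \;\ge\; \tfrac{7}{8}
  \quad\text{for all } \alpha \in [0,1].
\]
```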
Abstract. For multi-objective optimization problems, it is meaningful
to compute a set of solutions covering all possible trade-offs between the 
different objectives. The multi-objective knapsack problem is a general- 
ization of the classical knapsack problem in which each item has sev- 
eral profit values. For this problem, efficient algorithms for computing 
a provably good approximation to the set of all non-dominated feasible 
solutions, the Pareto frontier, are studied. 

For the multi-objective 1-dimensional knapsack problem, a fast fully 
polynomial-time approximation scheme is derived. It is based on a new 
approach to the single-objective knapsack problem using a partition of 
the profit space into intervals of exponentially increasing length. For 
the multi-objective m-dimensional knapsack problem, the first known 
polynomial-time approximation scheme, based on linear programming, 
is presented. 



1 Introduction 

The knapsack problem is a classical problem in combinatorial optimization. In 
the original version of the problem, the input consists of a knapsack with a 
certain capacity and a set of items, each of which has a weight and a profit. 
A feasible solution is a selection of items that can be put into the knapsack, 
i.e., the sum of the weights of the selected items must not exceed the knapsack 
capacity. The goal is to maximize the total profit, i.e., the sum of the profits 
of the selected items. This knapsack problem is also called the 0-1 knapsack 
problem. It is NP-hard, but it was one of the first problems for which a fully
polynomial-time approximation scheme was known. 

In spite of its comparatively long history, variants of the knapsack prob- 
lem are still being studied intensively. In this paper, we are interested in the 
multi-objective m-dimensional knapsack problem. This problem generalizes the 
classical knapsack problem in two respects: First, each item has t different prof- 
its, and instead of trying to compute a single solution with maximum profit, 
we want to compute a set of feasible solutions covering all possible trade-offs 
between the different profit values. Second, each item has m different weights, 
the knapsack has m different capacities, and the knapsack constraints must be
satisfied for each of the m weights resp. capacities. 

Zitzler and Thiele have presented experimental comparisons of various evolu- 
tionary algorithms for the multi-objective m-dimensional knapsack problem [13]. 
Contrary to their approach, we are interested in polynomial-time algorithms 
with provable approximation guarantees for this problem. We obtain a fully 
polynomial-time approximation scheme for the multi-objective 1-dimensional 
knapsack problem and a polynomial-time approximation scheme for the multi- 
objective m-dimensional knapsack problem. 

Multi-objective optimization problems are frequently encountered in prac- 
tice. Often there are several different criteria measuring the “quality” of a so- 
lution, and it is not possible to select a most important criterion or to combine 
the criteria into a single objective function. In the context of knapsack problems 
consider e.g. a government agency which has to choose a subset out of a given 
list of different projects subject to monetary restrictions. Each project requires 
a certain budget (possibly also human resources of the agency, office space, etc.) 
and yields a certain profit for different objectives such as employment effect, 
infrastructure, side effects for private economy, social effects and public opin- 
ion. In such applications, the decision maker may want to have an algorithm 
to compute a set of good solutions (instead of only one solution) with various 
trade-offs between the different criteria, so that she can select the most desir- 
able solution after inspecting the various alternatives. A discussion of different 
concepts of multi-objective optimization can be found in the recent textbook by 
Ehrgott [1]. 

2 Preliminaries 

2.1 Multi-objective Optimization 

Let $I$ be an instance of an optimization problem with $t$ objectives. If $S$ is a feasible
solution, let $V_k(S)$ denote the value of $S$ with respect to the $k$-th objective,
$1 \le k \le t$. We assume throughout this paper that all objectives are maximization
criteria and that the number $t$ of objectives is a constant.

A feasible solution $S_1$ weakly dominates a feasible solution $S_2$ if $V_k(S_1) \ge
V_k(S_2)$ for all $1 \le k \le t$. We say that a set $\mathcal{P}$ of feasible solutions for $I$ is a Pareto
frontier (sometimes also called Pareto curve) if, for any feasible solution $S$ for $I$,
$\mathcal{P}$ contains a solution that weakly dominates $S$. A set of feasible solutions is
called reduced if it does not contain two different solutions $S_1$ and $S_2$ such that
$S_1$ weakly dominates $S_2$. Usually, a Pareto frontier is assumed to be reduced.
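The definitions of weak dominance and of a reduced set translate directly into code. The sketch below works on objective vectors (tuples of the $t$ values of a solution) rather than on solutions themselves, which is an assumption made for brevity; the function names are hypothetical.

```python
def weakly_dominates(a, b):
    """True if objective vector a is at least b in every coordinate."""
    return all(x >= y for x, y in zip(a, b))

def reduce_solutions(values):
    """Return a reduced subset that still weakly dominates every input vector."""
    reduced = []
    for v in values:
        # skip v if something already kept weakly dominates it
        if any(weakly_dominates(u, v) for u in reduced):
            continue
        # otherwise drop everything v weakly dominates, then keep v
        reduced = [u for u in reduced if not weakly_dominates(v, u)]
        reduced.append(v)
    return reduced
```

Applied to the objective vectors of all feasible solutions, the result is a reduced Pareto frontier in the sense defined above; duplicates of the same vector are collapsed to one representative.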

For $\rho \ge 1$, a solution $S_1$ is called a $\rho$-approximation of a solution $S_2$ if
$V_k(S_1) \ge V_k(S_2)/\rho$ for all $1 \le k \le t$. A set $\mathcal{F}$ of feasible solutions for $I$ is called
a $\rho$-approximation of the Pareto frontier if, for every feasible solution $S$ for $I$,
$\mathcal{F}$ contains a feasible solution $S'$ that is a $\rho$-approximation of $S$. Note that $\rho$
always satisfies $\rho \ge 1$. The closer $\rho$ is to 1, the better the approximation of the
Pareto frontier.
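As a small illustration of this definition, the following sketch checks whether a family of objective vectors is a $\rho$-approximation of the Pareto frontier of an explicitly listed set of feasible objective vectors. Listing all feasible vectors is only practical for toy instances, and the function names are hypothetical.

```python
def is_rho_approximation(candidate, target, rho):
    """True if candidate is within a factor rho of target in every objective."""
    return all(c >= t / rho for c, t in zip(candidate, target))

def approximates_frontier(family, feasible, rho):
    """True if every feasible vector is rho-approximated by some family member."""
    return all(
        any(is_rho_approximation(f, s, rho) for f in family)
        for s in feasible
    )
```

For example, with feasible vectors (10, 1), (9, 2), (1, 10), the two-element family {(10, 1), (1, 10)} is a 2-approximation but not a 1-approximation, because (9, 2) is not weakly dominated by either member.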
An algorithm that runs in polynomial time in the size of the input and that always
outputs a $\rho$-approximation of the Pareto frontier is called a $\rho$-approximation
algorithm. A polynomial-time approximation scheme (PTAS) for the Pareto frontier
is a family of algorithms that contains, for every fixed constant $\varepsilon > 0$, a
$(1 + \varepsilon)$-approximation algorithm $A_\varepsilon$. If the running time of $A_\varepsilon$ is polynomial
in the size of the input and in $\varepsilon^{-1}$, the family of algorithms is called a fully
polynomial-time approximation scheme (FPTAS).

The Pareto frontier of an instance of a multi-objective optimization problem 
may contain an arbitrarily large number of solutions. On the other hand, for every
$\varepsilon > 0$ there exists a $(1 + \varepsilon)$-approximation of the Pareto frontier that consists
of a number of solutions that is polynomial in the size of the instance and
in $\varepsilon^{-1}$ (under reasonable assumptions). An explicit proof for this observation
has recently been given by Papadimitriou and Yannakakis [9]. Consequently, a 
PTAS or FPTAS for a multi-objective optimization problem does not only have 
the advantage of computing a provably good approximation in polynomial time, 
but also has a good chance of presenting a reasonably small set of solutions to 
the user. 

2.2 Multi-objective Knapsack Problems 

We will consider two types of multi-objective knapsack problems. In the first 
case, the classical 0-1 knapsack problem is simply extended by introducing t 
profit values for every item. Hence, an instance $I$ of the multi-objective knapsack
problem consists of the following:

— a number $n$ of items

— for each item $i$ a weight $w_i \in \mathbb{N}$ and $t$ profits $p_{ki} \in \mathbb{N}$, $k = 1, \ldots, t$

— a knapsack with capacity $c \in \mathbb{N}$.

A feasible solution is a subset $S$ of the given items satisfying the weight constraint
$\sum_{i \in S} w_i \le c$.

In the more general multi-objective m- dimensional knapsack problem there 
are m weights for every item. More precisely, an instance of this problem consists 
of the following: 

— a number n of items 

— for each item $i$ there are $m$ weights $w_{ki} \in \mathbb{N}$, $k = 1, \ldots, m$, and $t$ profits
$p_{ki} \in \mathbb{N}$, $k = 1, \ldots, t$

— a knapsack with $m$ capacities $c_k \in \mathbb{N}$, $k = 1, \ldots, m$.

Here, a feasible solution is a subset $S$ of the given items satisfying all of the
following $m$ constraints:

$\sum_{i \in S} w_{ki} \le c_k$ for $k = 1, \ldots, m$.

Throughout this paper, we assume that $m$ is a constant.

In both cases, the $t$ objective values of a feasible solution $S$ are $V_1(S), \ldots,
V_t(S)$ with $V_k(S) = \sum_{i \in S} p_{ki}$.
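The feasibility constraints and objective values of the $m$-dimensional problem can be written down directly. In this sketch the instance layout is an assumption: `weights[k][i]` stores $w_{ki}$, `capacities[k]` stores $c_k$, `profits[k][i]` stores $p_{ki}$, items are indexed from 0, and the function names are hypothetical.

```python
def is_feasible(subset, weights, capacities):
    """Check all m knapsack constraints for the item set `subset`."""
    return all(
        sum(row[i] for i in subset) <= cap
        for row, cap in zip(weights, capacities)
    )

def objective_values(subset, profits):
    """Return the vector (V_1(S), ..., V_t(S)) for the item set `subset`."""
    return tuple(sum(row[i] for i in subset) for row in profits)
```

The 1-dimensional problem of the first definition is the special case with a single weight row and a single capacity.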
3 Related Work

3.1 Single-objective Knapsack Problems 

The existence of an FPTAS for the classical 0-1 knapsack problem has been 
known right from the beginning of research on approximation algorithms (cf. [8]). 
However, as soon as a second weight constraint is added, thus extending the 
problem to the 2-dimensional knapsack problem, the existence of an FPTAS
would imply P = NP as shown by Korte and Schrader [3]. Hence, we can 
only hope to develop a PTAS for the multi-objective m-dimensional knapsack 
problem. 

Indeed, for the single-objective m-dimensional knapsack problem a PTAS 
was presented by Frieze and Clarke [2]. 

Theorem 1 (Frieze and Clarke, 1984). For every constant m, there exists
a polynomial-time approximation scheme for the m-dimensional 0-1 knapsack 
problem. 



3.2 Multi-objective Knapsack Problems 

An extensive study of multi-objective combinatorial optimization problems was 
carried out by Safer and Orlin [11,12]. In [11], they study necessary and suffi- 
cient conditions for the existence of an FPTAS for a multi-objective optimization 
problem. Their approach is based on the concept of VPP algorithms and VPP 
reductions (VPP stands for value-pseudo-polynomial). It allows one to obtain an
FPTAS for a multi-objective optimization problem either by designing a VPP 
algorithm for it or by using a VPP reduction to a problem for which a VPP 
algorithm has already been found. In [12], they apply these techniques to ob- 
tain FPTASs for various multi-objective network flow, knapsack, and scheduling 
problems. In particular, they prove the existence of an FPTAS for the multi- 
objective knapsack problem, which we study in Section 4. However, it should 
be noted that their FPTAS is obtained using a VPP reduction, which in gen- 
eral does not lead to an FPTAS with a good running-time. In particular, their 
approach involves the solution of many different scaled versions of the given in- 
stance using a VPP algorithm. In [12, p. 26] they write: “Note that the results 
presented are existential. The characteristics of a particular problem must be 
considered carefully in order to develop a practical algorithm for its solution.” 
In Section 4 we develop such an FPTAS that is tailored to the multi-objective 
knapsack problem and provides a better running-time than the FPTAS resulting 
from the existence proof in [12]. 

A detailed study of different dynamic programming approaches for the op- 
timal solution of the multi-objective knapsack problem was recently given by 
Klamroth and Wiecek [6]. A branch and bound algorithm for the bicriteria
knapsack problem ($t = 2$) was recently given by Ehrgott [1].

Further results concerning the existence of an FPTAS for a multi-objective 
optimization problem are given by Papadimitriou and Yannakakis [9]. They show
that an FPTAS exists if and only if a certain gap problem can be solved
ciently. For the case of linear objective functions and a feasible set consisting of 
integer vectors satisfying certain conditions, they show that there is an FPTAS if 
there is a pseudopolynomial-time algorithm for the exact version of the problem 
(determining for a single linear objective function and a target value B whether 
a feasible solution with objective value exactly B exists). 

4 An FPTAS for the Multi-objective Knapsack Problem 

The standard procedure to construct an FPTAS for the knapsack problem and its 
relatives is to start with an optimal dynamic programming scheme and transform 
it into an FPTAS by appropriate scaling techniques. Let us therefore begin by 
recalling an optimal dynamic programming scheme for the 0-1 knapsack problem 
and extending it to the multi-objective 0-1 knapsack problem. 

4.1 An Optimal Dynamic Programming Algorithm 

We will use dynamic programming by reaching in the profit space. The dynamic
programming function $W$ is defined by $W[v] = w$ and indicates that there exists
a subset of items with profit $v$ and weight $w$, where $w$ is minimal among all such
sets. It is well known that this function can be computed by going through the
recursion

$W[v + p_i] = \min\{W[v + p_i],\, W[v] + w_i\}$

for $i = 1, \ldots, n$ after initializing $W[0] = 0$ and $W[v] = c + 1$ for $v \geq 1$. The first
expression in the minimum denotes the weight of the best possible solution with
profit $v + p_i$ so far not including item $i$ whereas the second expression packs
item $i$ into the knapsack.

To bound the length of the dynamic programming array we need to find 
an upper bound UB on the optimal solution value. This can be done easily, 
e.g. by running the classical greedy algorithm with performance guarantee 1/2. 
Altogether the running time of this straightforward approach is $O(n \cdot UB)$, since
we try to add every item to every profit value. 
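
The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `knapsack_profit_dp` and the parameter `upper_bound` (playing the role of $UB$) are assumptions, and profits and weights are assumed to be positive integers. The inner loop scans the profit values downwards so that each item is packed at most once.

```python
def knapsack_profit_dp(profits, weights, capacity, upper_bound):
    """Return a list W where W[v] is the minimal weight of a subset of items
    with total profit exactly v. The value capacity + 1 marks profits that
    cannot be reached within the capacity (as in the initialization above)."""
    unreachable = capacity + 1
    W = [unreachable] * (upper_bound + 1)
    W[0] = 0
    for p, w in zip(profits, weights):
        # Scan profits downwards so that item i is used at most once.
        for v in range(upper_bound - p, -1, -1):
            # W[v] + w < unreachable holds only for feasible packings.
            if W[v] + w < W[v + p]:
                W[v + p] = W[v] + w
    return W


# Hypothetical instance; the sum of all profits serves as a trivial upper bound.
W = knapsack_profit_dp([6, 10, 12], [1, 2, 3], capacity=5, upper_bound=28)
best_profit = max(v for v, w in enumerate(W) if w <= 5)
```

Here the optimal value is read off as the largest profit whose minimal weight does not exceed the capacity.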

To generalize this algorithm to the multi-objective case we can basically 
expand it to the t-dimensional profit space. First of all we have to compute 
upper bounds on each of the t objective functions. They can be obtained easily, 
e.g. by running the greedy algorithm separately for every objective. Thus, t 
upper bounds $UB_1, \ldots, UB_t$ are computed. For convenience we also introduce
$UB_{\max} := \max\{UB_1, \ldots, UB_t\}$.

Then we define the following dynamic programming function for $v_k = 0, 1, \ldots, UB_k$
and $k = 1, \ldots, t$:

$W[v_1, \ldots, v_t] = w \iff$
there exists a subset of items with profit $v_k$ in the $k$-th objective for
$k = 1, \ldots, t$ and weight $w$, where $w$ is minimal among all such sets.

After setting all function values to $c + 1$ in the beginning (indicating that the
corresponding combination of profits cannot be reached yet) except $W[0, \ldots, 0]$,
which is set to 0, the function $W$ can be computed by considering the items
in turn and performing for every item $i$ the analogous update operation for all
feasible combinations of profit values (i.e. $v_k + p_{ki} \leq UB_k$):

$W[v_1 + p_{1i}, \ldots, v_t + p_{ti}] = \min\{W[v_1 + p_{1i}, \ldots, v_t + p_{ti}],\, W[v_1, \ldots, v_t] + w_i\}$

As above this operation can be seen as trying to improve the current weight of
an entry by adding item $i$ to the entry $W[v_1, \ldots, v_t]$.

It remains to extract the Pareto frontier from W. This can be done in a 
straightforward way: Each entry of $W$ satisfying $W[v_1, \ldots, v_t] \leq c$ corresponds
to a feasible solution with objective values $v_1, \ldots, v_t$. The set of items leading
to that solution can be determined easily if standard bookkeeping techniques 
are used. The feasible solutions are collected and the resulting set of solutions 
is reduced by discarding solutions that are weakly dominated by other solutions 
in the set. 

The correctness of this approach follows from classical dynamic programming
theory. Its running time is given by going through the complete profit space for
every item, which is $O(n \prod_{k=1}^{t} UB_k)$ and hence $O(n (UB_{\max})^t)$.
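
The multi-objective dynamic program and the extraction of the Pareto frontier can be illustrated as follows. This is a sketch under stated assumptions, not the authors' code: the names `multiobjective_knapsack_dp` and `pareto_frontier` are hypothetical, and instead of a full $t$-dimensional array the sketch stores only reachable profit vectors in a dictionary, which does not change the recursion.

```python
def multiobjective_knapsack_dp(profits, weights, capacity):
    """profits[i] is the tuple (p_1i, ..., p_ti) of item i.

    Returns a dict mapping every reachable profit vector to the minimal
    weight (at most capacity) of a subset attaining exactly that vector."""
    t = len(profits[0])
    W = {(0,) * t: 0}
    for p, w in zip(profits, weights):
        # Iterate over a snapshot so that item i is added at most once.
        for v, weight in list(W.items()):
            if weight + w > capacity:
                continue
            target = tuple(a + b for a, b in zip(v, p))
            if weight + w < W.get(target, capacity + 1):
                W[target] = weight + w
    return W


def pareto_frontier(vectors):
    """Discard every vector that is weakly dominated by a different vector."""
    vectors = set(vectors)
    return sorted(
        v for v in vectors
        if not any(u != v and all(a >= b for a, b in zip(u, v)) for u in vectors)
    )


# Hypothetical instance with two objectives and capacity 3.
W = multiobjective_knapsack_dp([(3, 1), (1, 3), (2, 2)], [2, 2, 3], 3)
frontier = pareto_frontier(W)
```

The quadratic dominance check is chosen for brevity; it is adequate for small examples only.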

4.2 Developing an FPTAS 

An obvious strategy to develop an FPTAS for the multi-objective knapsack prob- 
lem would be to scale the profits and apply the optimal dynamic programming 
algorithm after scaling, thus adapting the FPTAS for the 0-1 knapsack problem 
to the multi-objective case. However, this approach runs into difficulties. Since 
the error allowed for a $(1+\varepsilon)$-approximation is a relative error and hence depends
on the solution value in every objective criterion of any given solution,
the classical partitioning of the profit space into intervals of equal size does not 
work. Note that usually the set of items is partitioned into items with large 
and small profits. Then the scaled dynamic programming is performed only for 
the large items. The small items are finally added by the greedy algorithm. To 
guarantee that the relative error contributed by the small items to a solution, 
which might have an extremely small profit in one dimension, is not too large, 
the set of small items must be chosen to contain only items with extremely small 
profits in that dimension. But then the set of large items spans a wide profit 
range. Partitioning this range into a bounded number of appropriately small 
equal sized intervals is an impossible task, since the ratio between the smallest 
and largest objective value of one criterion (i.e. the upper bound on the profit 
space) can not be bounded. 

To make the extension of an FPTAS from one to m dimensions possible, we 
require an algorithm which computes a feasible solution with a relative $\varepsilon$-error
for every possible profit value in every objective. In the following we will present 
an FPTAS for the 0-1 knapsack problem fulfilling this particular property and 
briefly show how it can be extended to an FPTAS for the multi-objective knapsack
problem.

The description of this FPTAS, which is derived from the dynamic programming
scheme at the beginning of Section 4.1, is relatively simple. The profit space
between 1 and $UB$ is partitioned into $u$ intervals

$[1, (1+\varepsilon)^{1/n}),\ [(1+\varepsilon)^{1/n}, (1+\varepsilon)^{2/n}),\ [(1+\varepsilon)^{2/n}, (1+\varepsilon)^{3/n}),\ \ldots,\ [(1+\varepsilon)^{(u-1)/n}, (1+\varepsilon)^{u/n})$

with $u := \lceil n \log_{1+\varepsilon} UB \rceil$. Note that $u$ is of order $O(1/\varepsilon \cdot n \log UB)$ and hence
polynomial in the length of the encoded input. The main point of this construction
is to guarantee that in every interval the upper endpoint is exactly $(1+\varepsilon)^{1/n}$
times the lower endpoint.

To achieve an FPTAS we adapt the above dynamic programming algorithm
to the partitioned profit space. Instead of considering function $W$ for all integer
profit values, we consider only the value 0 and the $u$ lower endpoints of intervals
as possible profit values. The definition of the resulting dynamic programming
function $\tilde{W}$ is slightly different from $W$. An entry $\tilde{W}[\tilde{v}] = w$, where $\tilde{v}$ is the
lower endpoint of an interval of the partitioned profit space, indicates that there
exists a subset of items with weight $w$ and a profit of at least $\tilde{v}$.

The update operation where item $i$ is added to every entry $\tilde{W}[\tilde{v}]$ is modified
in the following way. We compute the profit attained from adding item $i$ to the
current entry. The resulting value $\tilde{v} + p_i$ is rounded down to the nearest lower
endpoint of an interval $\tilde{u}$. The weight in the corresponding dynamic programming
function entry is compared to $\tilde{W}[\tilde{v}] + w_i$. The minimum of the two values is stored
as (possibly updated) function value $\tilde{W}[\tilde{u}]$.

The running time of this approach is bounded by $O(nu)$ and hence by
$O(1/\varepsilon \cdot n^2 \log UB)$.
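
The rounding step of the approximate dynamic program can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the name `knapsack_fptas_table` is hypothetical, profits are assumed to be positive integers, and interval endpoints are computed in floating point, so the small correction loops in `index_of` are an added safeguard against rounding errors at interval boundaries.

```python
import math


def knapsack_fptas_table(profits, weights, capacity, eps):
    """Approximate DP over the lower endpoints (1+eps)^(j/n) of the intervals.

    Returns a dict mapping a rounded profit (0 or a lower interval endpoint)
    to the minimal weight found for a subset with at least that profit."""
    n = len(profits)
    step = math.log1p(eps) / n  # logarithm of the interval ratio (1+eps)^(1/n)

    def index_of(value):
        """Index j with (1+eps)^(j/n) <= value < (1+eps)^((j+1)/n), value >= 1."""
        j = int(math.log(value) / step)
        while math.exp(j * step) > value:        # guard against rounding errors
            j -= 1
        while math.exp((j + 1) * step) <= value:
            j += 1
        return j

    # Key None stands for profit 0; an integer key j stands for (1+eps)^(j/n).
    table = {None: 0}
    for p, w in zip(profits, weights):
        for key, weight in list(table.items()):  # snapshot: item used once
            if weight + w > capacity:
                continue
            base = 0.0 if key is None else math.exp(key * step)
            j = index_of(base + p)               # round down to a lower endpoint
            if weight + w < table.get(j, capacity + 1):
                table[j] = weight + w
    return {(0.0 if k is None else math.exp(k * step)): wt
            for k, wt in table.items()}


# Hypothetical instance.
table = knapsack_fptas_table([6, 10, 12], [1, 2, 3], capacity=5, eps=0.1)
```

For every profit value that is reachable exactly, the table should then contain an entry of no larger weight whose rounded profit is within a factor $(1+\varepsilon)$, which is the property stated in Theorem 2.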

Theorem 2. The above algorithm computes a $(1+\varepsilon)$-approximation for every
reachable profit value of a 0-1 knapsack problem.

Proof. The correctness of the statement is shown by induction over the set of 
items. In particular, we will show the following claim. 

Claim: After performing all update operations with items $1, \ldots, i$ for
some $i \in \{1, \ldots, n\}$ for the optimal function $W$ and the approximate
function $\tilde{W}$ there exists for every entry $W[v]$ an entry $\tilde{W}[\tilde{v}]$ with

(i) $\tilde{W}[\tilde{v}] \leq W[v]$ and (ii) $(1+\varepsilon)^{i/n}\, \tilde{v} \geq v$.

Evaluating (ii) for i = n immediately yields the desired result. 

To prove the Claim for $i = 1$ we add item 1 into the empty knapsack. Hence
we get an update of the optimal function with $W[p_1] = w_1$ and of the approximate
function with $\tilde{W}[\tilde{v}] = w_1$ where $\tilde{v}$ is the largest interval endpoint not
exceeding $p_1$. Property (i) holds with equality, whereas (ii) follows from the fact
that $p_1$ and $\tilde{v}$ are in the same interval and hence $(1+\varepsilon)^{1/n}\, \tilde{v} \geq p_1$.

Assuming the Claim to be true for $i - 1$ we can show the properties for $i$
by investigating the situation after trying to add item $i$ to every function entry.
Clearly, those entries of $W$ which were not updated by item $i$ fulfill the claim by
the induction hypothesis since (ii) holds for $i - 1$ and even more so for $i$.

Now let us consider an entry of $W$ which was actually updated, i.e. item $i$
was added to $W[v]$ yielding $W[v + p_i]$. Since the Claim is true before considering
item $i$, there exists $\tilde{W}[\tilde{v}]$ with $(1+\varepsilon)^{(i-1)/n}\, \tilde{v} \geq v$, i.e., $\tilde{v} \geq v/(1+\varepsilon)^{(i-1)/n}$.

In the FPTAS $\tilde{v} + p_i$ is computed and rounded down to some lower interval
endpoint $\tilde{u}$. From the interval construction we have $(1+\varepsilon)^{1/n}\, \tilde{u} \geq \tilde{v} + p_i$. Putting
things together this yields

$(1+\varepsilon)^{1/n}\, \tilde{u} \geq v/(1+\varepsilon)^{(i-1)/n} + p_i \geq (v + p_i)/(1+\varepsilon)^{(i-1)/n}$.

Moving terms around, this proves that $\tilde{u}$ satisfies (ii) for $v + p_i$. Property (i)
follows immediately by applying the claim for $W[v]$ since

$\tilde{W}[\tilde{u}] = \min\{\tilde{W}[\tilde{u}],\, \tilde{W}[\tilde{v}] + w_i\} \leq W[v] + w_i = W[v + p_i]$. □

The main point of developing the above FPTAS was the approximation of every 
reachable profit value and not only the optimal solution value, since this will 
allow us to extend the FPTAS to the multi-objective case. In itself, this new 
FPTAS for the single-objective 0-1 knapsack problem does not improve on the 
previously best known FPTASs. For the 0-1 knapsack problem in general, the 
best known FPTAS is due to Kellerer and Pferschy [4,5]. For the special case 
where $n < 1/\varepsilon$, the currently best FPTAS by Magazine and Oguz [7] runs in
asymptotic time of $O(n^2 \log n \cdot 1/\varepsilon)$. Depending on the relationship between $\log n$
and log UB for a given family of instances, our new FPTAS may be preferable. 

The space requirement of our new FPTAS is $O(n + u)$, i.e. $O(n \log UB \cdot 1/\varepsilon)$,
after avoiding the explicit storage of subsets for every dynamic programming
entry (see [10]). This is higher than the $O(n \cdot 1/\varepsilon)$ required by [7].

It should be noted that the performance of this new FPTAS for the single-objective
knapsack problem can be further improved for the general case of
$n > 1/\varepsilon$ by introducing the usual partitioning of items into small items
($p_i \leq \varepsilon UB/2$) and large items ($p_i > \varepsilon UB/2$). Applying the interval structure to the
set of large items we can start with an interval $[\varepsilon UB/2,\ \varepsilon UB/2 \cdot (1+\varepsilon)^{1/n})$ and
continue as above to generate further intervals by multiplying with $(1+\varepsilon)^{1/n}$. In
this case, the number of intervals is of order $O(1/\varepsilon \cdot n \log(1/\varepsilon))$ since the value
of $UB$ cancels out in the computation. Hence, the total running time would be
$O(n \log n + 1/\varepsilon \cdot n^2 \log(1/\varepsilon))$.

Furthermore, we can increase the range of the intervals by replacing the
multiplier $(1+\varepsilon)^{1/n}$ by $(1+\varepsilon)^{\varepsilon}$. Recall that in the proof of Theorem 2 the value
$1/n$ was only necessary because every entry of the dynamic programming function
may correspond to a sequence of up to $n$ items. However, dealing only with
large items, there can be at most $O(1/\varepsilon)$ items contributing to any dynamic
programming entry. This reduces the number of intervals to $O(1/\varepsilon^2 \log(1/\varepsilon))$.

A further improvement is attained by using the reduction of the set of
large items given in [4] as a preprocessing step. It follows that only $1/\varepsilon^2$ large
items need to be considered at all. Performing an update operation for each of
them with the reduced number of intervals yields an improved running time of
$O(n \log n + 1/\varepsilon^4 \log(1/\varepsilon))$ with $O(n + 1/\varepsilon^2 \log(1/\varepsilon))$ space.

The extension of this FPTAS to the multi-objective problem can now be performed
in a completely analogous way as for the optimal dynamic programming
scheme of Section 4.1. The partitioning of the profit space is done for every
dimension, which yields $u_k := \lceil n \log_{1+\varepsilon} UB_k \rceil$ intervals for every objective $k$.
Instead of adding an item $i$ to all lower interval endpoints $\tilde{v}$ as in the one-dimensional
case we now have to add it to every possible $t$-tuple $(\tilde{v}_1, \ldots, \tilde{v}_t)$,
where $\tilde{v}_k$ is either 0 or a lower interval endpoint of the $k$-th profit space partitioning.
The resulting objective values $\tilde{v}_1 + p_{1i}, \ldots, \tilde{v}_t + p_{ti}$ are all rounded down
for every objective to the nearest interval endpoint. At the resulting $t$-tuple of
lower interval endpoints a comparison to the previous dynamic programming
function entry is performed.

Since the “projection” of this algorithm to a single objective yields exactly 
the FPTAS for the 0-1 knapsack problem as discussed above, the correctness of 
this method follows from Theorem 2. The running time of this approach is clearly
bounded by $O(n \prod_{k=1}^{t} u_k)$ and hence $O(n (1/\varepsilon \cdot n \log UB_{\max})^t)$. Summarizing,
we have shown the following statement.

Theorem 3. For every constant $t$, the above algorithm is an FPTAS for the
multi-objective knapsack problem with $t$ objectives.

5 A Polynomial-Time Approximation Scheme 

In this section, we present a polynomial-time approximation scheme for the 
multi-objective m-dimensional knapsack problem. Recall that t denotes the num- 
ber of objective functions and that we assume that both m and t are constants. 
We make use of some of the ideas from the PTAS for the single-objective m- 
dimensional knapsack problem due to Frieze and Clarke [2] . 

Let $\varepsilon > 0$ be given. Choose $\delta > 0$ and $\mu > 0$ to be positive constants such
that $(1 + \mu)(1 + \delta) \leq 1 + \varepsilon$. For example, set $\delta = \mu = \min\{1/3, \varepsilon/3\}$.

Let an instance $I$ of the multi-objective $m$-dimensional knapsack problem
be given. First, the algorithm computes for each $k$, $1 \leq k \leq t$, a $(1 + \delta)$-approximation
$U_k$ of the $m$-dimensional 0-1 knapsack problem with objective $p_{k*}$
(i.e., with $p_i = p_{ki}$ for all items $i$), using Theorem 1. Note that any feasible solution
$S$ to $I$ satisfies $0 \leq V_k(S) \leq (1 + \delta) V_k(U_k)$ for all $1 \leq k \leq t$. Therefore,
the set of all objective vectors of all feasible solutions to $I$ is contained in

$U = [0, (1 + \delta) V_1(U_1)] \times [0, (1 + \delta) V_2(U_2)] \times \cdots \times [0, (1 + \delta) V_t(U_t)]$.

Define $u_k = \lceil \log_{1+\delta} V_k(U_k) \rceil$. With respect to the $k$-th objective, we consider
the following (lower) bounds for objective values:

– $u_k + 1$ lower bounds $\ell_{kr} = (1 + \delta)^{r-1}$ for $1 \leq r \leq u_k + 1$, and

– the lower bound $\ell_{k0} = 0$.

The algorithm enumerates all tuples $(r_1, r_2, \ldots, r_{t-1})$ of $t - 1$ integers satisfying
$0 \leq r_k \leq u_k + 1$ for all $1 \leq k < t$. Note that the number of such tuples is
polynomial in the size of the input if $t$ and $\delta$ are constants. Roughly speaking,
the algorithm considers, for each tuple $(r_1, r_2, \ldots, r_{t-1})$, the subspace of solutions
$S$ satisfying $V_k(S) \geq \ell_{k r_k}$ for all $1 \leq k < t$, and tries to maximize $V_t(S)$ among
all such solutions.
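
The grid of lower bounds and the enumeration of tuples can be made concrete with a short sketch. The helper names `lower_bounds` and `bound_tuples` are hypothetical, the values passed as `best_values` stand in for the approximate single-objective values $V_k(U_k)$ and are assumed to be at least 1, and the exponent $r - 1$ follows the reconstruction of the bounds used in this section.

```python
import math
from itertools import product


def lower_bounds(best_value, delta):
    """Return [l_0, l_1, ..., l_{u+1}] with l_0 = 0 and l_r = (1+delta)^(r-1),
    where u = ceil(log_{1+delta}(best_value))."""
    u = math.ceil(math.log(best_value) / math.log1p(delta))
    return [0.0] + [(1 + delta) ** (r - 1) for r in range(1, u + 2)]


def bound_tuples(best_values, delta):
    """Enumerate all tuples (r_1, ..., r_{t-1}) with 0 <= r_k <= u_k + 1.

    Only the first t - 1 objectives are bounded; the last one is maximized."""
    ranges = [range(len(lower_bounds(b, delta))) for b in best_values[:-1]]
    return list(product(*ranges))


# Hypothetical values V_k(U_k) for t = 3 objectives.
bounds = lower_bounds(10, 0.5)
tuples = bound_tuples([10, 4, 7], 0.5)
```

The number of tuples is the product of the $u_k + 2$ choices per bounded objective, which is polynomial in the input size for constant $t$ and $\delta$.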

For this purpose, the algorithm proceeds as follows. Let $h = \lceil (t + m - 1)(1 + \mu)/\mu \rceil$.
Let $A_1, A_2, \ldots, A_t$ be subsets of $\{1, 2, \ldots, n\}$ with cardinality at most $h$.
The algorithm enumerates all possibilities for choosing such sets $A_1, A_2, \ldots, A_t$.
Intuitively, the set $A_k$ represents the selected items of highest profit with respect
to $p_{k*}$. Let $A = A_1 \cup A_2 \cup \cdots \cup A_t$. For $1 \leq k \leq t$, let

$F_k = \{ j \in \{1, 2, \ldots, n\} \setminus A_k \mid p_{kj} > \min\{p_{kq} \mid q \in A_k\} \}$.

Intuitively, $F_k$ is the set of all items that cannot be put into the knapsack if
$A_k$ are the items with highest profit with respect to $p_{k*}$ in the solution. Let
$F = F_1 \cup F_2 \cup \cdots \cup F_t$. If $F \cap A \neq \emptyset$, the current choice of the sets $A_k$ is not
consistent, and the algorithm continues with the next possibility of choosing the
sets $A_1, A_2, \ldots, A_t$.
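
The consistency test for a guess of the sets $A_1, \ldots, A_t$ can be sketched as follows. The function names `forbidden_items` and `is_consistent` are hypothetical, items are indexed from 0 rather than 1, and an empty guess $A_k$ is assumed to forbid nothing (the minimum over the empty set is treated as infinite).

```python
def forbidden_items(profits_k, A_k):
    """F_k: items outside A_k whose k-th profit exceeds the smallest k-th
    profit inside A_k."""
    if not A_k:
        return set()
    threshold = min(profits_k[q] for q in A_k)
    return {j for j in range(len(profits_k))
            if j not in A_k and profits_k[j] > threshold}


def is_consistent(profit_rows, guesses):
    """profit_rows[k][j] is the profit of item j in objective k and
    guesses[k] is the guessed set A_k. Returns (consistent, A, F)."""
    A = set().union(*guesses)
    F = set()
    for profits_k, A_k in zip(profit_rows, guesses):
        F |= forbidden_items(profits_k, A_k)
    return not (A & F), A, F


# Hypothetical instance with three items and two objectives.
rows = [[5, 3, 1], [1, 3, 5]]
ok, A, F = is_consistent(rows, [{1}, {0}])
```

In this instance the guess is rejected: item 0 is forced into the solution by $A_2$ but forbidden by $A_1$, because its first profit exceeds that of the only item in $A_1$.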

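So let us assume that $F \cap A = \emptyset$. Consider the following linear programming
relaxation:

\[
\begin{array}{llll}
\text{(LP)} & \max & \sum_{j=1}^{n} p_{tj} x_j & \\
 & \text{s.t.} & \sum_{j=1}^{n} w_{kj} x_j \leq c_k, & \text{for } k = 1, 2, \ldots, m \\
 & & \sum_{j=1}^{n} p_{kj} x_j \geq \ell_{k r_k}, & \text{for } k = 1, 2, \ldots, t - 1 \\
 & & x_j = 0, & j \in F \\
 & & x_j = 1, & j \in A \\
 & & 0 \leq x_j \leq 1, & j \in \{1, 2, \ldots, n\} \setminus (F \cup A)
\end{array}
\]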

Let $x$ be an optimal (basic feasible) solution vector for (LP). Such a vector $x$ can
be computed in polynomial time, if it exists. (If no such vector exists, proceed to
the next combination of sets $A_1, \ldots, A_t$.) As (LP) has only $t + m - 1$ non-trivial
constraints, at most $t + m - 1$ components of $x$ are fractional. Now an integral
vector $\bar{x}$ is obtained by rounding down $x$, i.e., we set $\bar{x}_j = \lfloor x_j \rfloor$. From the integral
vector $\bar{x}$, we obtain a solution $S$ to the multi-objective $m$-dimensional knapsack
problem by letting $S = \{ j \in \{1, 2, \ldots, n\} \mid \bar{x}_j = 1 \}$.

For each possibility of choosing Ai, A2, . . . , At consistently, we either find 
that (LP) has no solution, or we compute an optimal fractional solution to (LP) 
and obtain a rounded integral solution S. We output all integral solutions that 
are obtained in this way (if any). The procedure is repeated for every tuple
$(r_1, r_2, \ldots, r_{t-1})$. This completes the description of the algorithm.

Theorem 4. For every pair of constants $m$ and $t$, the above algorithm is a
polynomial-time approximation scheme for the $t$-objective $m$-dimensional knapsack
problem.



Proof. We show that the algorithm described above is a $(1 + \varepsilon)$-approximation
algorithm for the multi-objective $m$-dimensional knapsack problem.

First, we argue that the running-time is polynomial. Initially, the $(1 + \delta)$-approximation
algorithm for the single-objective $m$-dimensional 0-1 knapsack
problem is called $t$ times. This gives the solutions $U_1, \ldots, U_t$ and the numbers
$u_1, \ldots, u_t$. Then our algorithm enumerates all tuples of $t - 1$ numbers $r_1, \ldots, r_{t-1}$
satisfying $0 \leq r_k \leq u_k + 1$ for all $1 \leq k \leq t - 1$. There are $O(u_1 u_2 \cdots u_{t-1})$
such tuples. As each $u_i$ is polynomial in the size of the input, the total number
of tuples is also bounded by a polynomial. For each tuple, all combinations
of choosing sets $A_1, \ldots, A_t$ of cardinality at most $h$ are enumerated. There are
$O(n^{t(t+m-1)(1+\mu)/\mu})$ combinations. For each combination, the linear program
(LP) is created and solved in polynomial time. Since $m$, $t$, $\delta$ and $\mu$ are
constants, the overall running-time is polynomial in the size of the input.

Now, we analyze the approximation ratio. Consider an arbitrary feasible solution
$G$. We have to show that the output of the algorithm contains a solution
$S$ that is a $(1 + \varepsilon)$-approximation of $G$. For $1 \leq k \leq t - 1$, define
$r_k = \max\{r \mid \ell_{kr} \leq V_k(G)\}$. It follows that $\ell_{k r_k} \leq V_k(G) \leq \ell_{k r_k}(1 + \delta)$. For
$1 \leq k \leq t$, let $A_k$ be a set that contains $\min\{h, |G|\}$ items in $G$ with largest
profit $p_{k*}$. When the algorithm considers the tuple $(r_1, r_2, \ldots, r_{t-1})$ and the sets
$A_1, \ldots, A_t$, the linear program (LP) is feasible, because $G$ constitutes a feasible
solution. Therefore, the algorithm obtains a fractional solution $S_f$ and outputs
a rounded integral solution $S$ for this tuple $(r_1, r_2, \ldots, r_{t-1})$ and for this combination
of sets $A_1, \ldots, A_t$. If $|G| \leq h$, the rounded solution $S$ is at least as good
as $G$ with respect to all $t$ objectives, because it contains $G$ as a subset. Thus,
the solution $S$ output by the algorithm weakly dominates $G$ in this case.

Now assume that $|G| > h$. Consider objective $k$. We have that $V_k(S_f) \geq
V_k(G)/(1 + \delta)$. For $1 \leq k < t$, this is because $S_f$ is a feasible solution to (LP).
For $k = t$, we even have $V_t(S_f) \geq V_t(G)$, because $S_f$ is an optimal fractional
solution to (LP).

Furthermore, we claim that $V_k(S) \geq V_k(S_f)/(1 + \mu)$. Consider some objective
$k$, $1 \leq k \leq t$. When $S$ is obtained from $S_f$, at most $t + m - 1$ items are lost,
because $S_f$ has at most $t + m - 1$ fractional items, as we noted above. Let $p$ be
the smallest profit among the $h$ items with highest profit in $G$ with respect to
objective $k$. We have $V_k(S_f) \geq hp$ and $V_k(S) \geq V_k(S_f) - (t + m - 1)p$. Combining
these two inequalities, we get

\[
\begin{array}{rcl}
V_k(S) & \geq & V_k(S_f) - (t + m - 1)p \;\geq\; V_k(S_f) - \frac{t + m - 1}{h} V_k(S_f) \\
 & = & V_k(S_f)\left(1 - \frac{t + m - 1}{h}\right) \;\geq\; V_k(S_f)\left(1 - \frac{\mu}{1 + \mu}\right) \;=\; V_k(S_f)/(1 + \mu).
\end{array}
\]

So we have $V_k(S) \geq V_k(S_f)/(1 + \mu) \geq V_k(G)/((1 + \mu)(1 + \delta))$. Since we have
chosen $\mu$ and $\delta$ such that $(1 + \mu)(1 + \delta) \leq (1 + \varepsilon)$, we obtain that $S$ is a
$(1 + \varepsilon)$-approximation of $G$.

As the above argument is valid for any feasible solution G, we have shown 
that the set of solutions output by our algorithm is indeed a $(1 + \varepsilon)$-approximation
of the Pareto frontier. □

Abstract. Methods for ranking World Wide Web resources according to
their position in the link structure of the Web are receiving considerable 
attention, because they provide the first effective means for search en- 
gines to cope with the explosive growth and diversification of the Web. 
We show that layouts for effective visualization of an underlying link 
structure can be computed in sync with the iterative computation uti- 
lized in all popular such rankings. Our visualizations provide valuable 
insight into the link structure and the ranking mechanism alike. There- 
fore, they are useful for the analysis of query results, maintenance of 
search engines, and evaluation of Web graph models. 



1 Introduction 

The directed graph induced by the hyperlink structure of the Web has been 
recognized as a rich source of information. Understanding and exploiting this 
structure has a proven potential to help deal with the explosive growth and
diversification of the Web. Probably the most widely recognized example of this
kind is the PageRank index employed by the Google search engine [6].

PageRank is but one of many models and algorithms to rank Web resources 
according to their position in a link structure (see, e.g., [25,20,9,1,5,8]). Our goal 
is to supplement rankings with a meaningful visualization of the graph they are 
computed on. 

While graph visualization is an active area of research as well [10,19], its 
integration with quantitative analysis is only beginning to receive attention. 
It is, however, rather difficult to understand the determinants of a particular 
ranking if a visualization is prepared without explicitly taking it into account. 

A design for graph visualizations showing vertex prominence in the structural 
context is introduced in [4], where the vertical dimension of the layout space is 
reserved to represent exactly the prominence of each vertex, but since horizontal 
layout is done by an adaptation of the Sugiyama framework for layered graph 
drawing [27] it does not scale to graphs with more than a few hundred vertices. 

In the present application, it is also highly desirable that dense subgraphs 
be clustered, since on the Web they typically correspond to related resources. 
A well-known class of graph layout algorithms that exhibit these properties are
spring embedders, and the range for which they are practical has recently been
extended to sparse graphs with several thousands of vertices [13,15,29]. Since 
they require relatively complex memory management, however, they are less 
suitable in our situation. 

Instead, we apply a spectral graph layout approach, because of its formal 
and computational similarities with common ranking techniques. Background 
on link-based ranking is given in Sect. 2. The actual layout definition and com- 
putation are described in Sect. 3. In Sect. 4, we discuss applications and provide 
examples on generated and real-world data. 



2 Structural Ranking of Web Resources 

The structural features of the Web are captured in a directed graph $G = (V, E)$,
where the set $V$ of vertices represents the set of resources on the Web, and there
is a directed edge $(u, v) \in E$ from a resource $u$ to a resource $v$, if $u$ contains a
hyperlink to $v$. All graphs considered in this paper are assumed to be connected.
We do not allow parallel edges, but we do allow loops and a positive real weight
$\omega_{uv}$ for every edge. Let $A(G) = A = (a_{uv})_{u,v \in V}$ be the adjacency matrix of a graph,
i.e. $a_{uv} = \omega_{uv}$ if $(u, v) \in E$, and $a_{uv} = 0$ otherwise. The indegree (outdegree),
$d^+(v)$ ($d^-(v)$), of a vertex $v \in V$ is $\sum_{u:(u,v) \in E} \omega_{uv}$ ($\sum_{w:(v,w) \in E} \omega_{vw}$).

We will frequently use graph-related matrices, such as the adjacency matrix, 
and their spectra, i.e. the set of eigenvalues and associated eigenvectors. For 
background on matrix computations see, e.g., [14]. 

Any real-valued vector $p = (p_v)_{v \in V}$ defined on the vertices of a graph is
called a prominence index, where $p_v$ is the prominence of vertex $v$. A ranking
is obtained from a prominence index by ordering the vertices according to non- 
increasing prominence. 

Many models have been proposed to capture an explicitly or implicitly de- 
fined notion of a vertex’s prominence in a graph [18,17,11,3,12,25,20,1,9,8, and 
many more]. Though in general only defined for undirected graphs, we first 
outline eigenvector centrality [2], because it nicely illustrates some important 
commonalities of the three popular rankings that we discuss below. 

Assume that the prominence of a vertex is understood to be proportional
to the combined prominence of its neighbors,

$\lambda p_v = \sum_{u:(u,v) \in E} a_{uv}\, p_u$, $v \in V$,

where the constant $\lambda$ is introduced so that the system of equations has a non-zero
solution. This definition yields the eigensystem of the transposed adjacency
matrix,

$\lambda p = A^T p$, (eigenvector centrality)

and every eigenvector of $A^T$ gives a ranking of the vertices for the above notion
of prominence, though the principal eigenvector, i.e. the one associated with
the eigenvalue of largest magnitude, is generally preferred [3,12]. The principal
eigenvector can be obtained by power iteration, which starts with any non-zero
vector and iteratively multiplies the matrix with the current solution, e.g.
$p^{(0)} \leftarrow \mathbf{1}$ and

$p^{(k+1)} \leftarrow A^T \cdot p^{(k)}$.

Since the matrices considered here originate from large and sparse graphs, multiplication
is carried out by computing $p_v^{(k+1)} \leftarrow \sum_{u:(u,v) \in E} a_{uv}\, p_u^{(k)}$ for every $v \in V$.

Fig. 1. Prominence indices on a directed grid

In this and in the iterations to follow, we tacitly assume that each vector is 
normalized (e.g., to the length of the starting vector) before the next vector is 
computed. This serves to avoid numerical difficulties with numbers growing out 
of range, and does not affect the relative prominence of vertices. 
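
The sparse power iteration with normalization can be sketched in Python. This is a minimal illustration under assumptions: the function name `eigenvector_centrality`, the edge-list representation with triples `(u, v, weight)`, and the fixed iteration count are choices made here, not part of the original text. Each vector is rescaled to the Euclidean length of the all-ones starting vector.

```python
import math


def eigenvector_centrality(vertices, weighted_edges, iterations=100):
    """Power iteration p <- A^T p on an edge list of (u, v, weight) triples.

    After every step the vector is rescaled to the length of the starting
    vector, which leaves the relative prominence of the vertices unchanged."""
    p = {v: 1.0 for v in vertices}
    start_norm = math.sqrt(len(p))
    for _ in range(iterations):
        nxt = {v: 0.0 for v in vertices}
        for u, v, weight in weighted_edges:
            nxt[v] += weight * p[u]          # p_v <- sum over edges (u, v)
        norm = math.sqrt(sum(x * x for x in nxt.values()))
        if norm == 0.0:
            return nxt                       # no edges: nothing to rank
        p = {v: x * start_norm / norm for v, x in nxt.items()}
    return p


# Hypothetical undirected example: triangle a-b-c with a pendant vertex d at c.
undirected = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
edges = [(u, v, 1.0) for u, v in undirected] + [(v, u, 1.0) for u, v in undirected]
p = eigenvector_centrality("abcd", edges)
```

Note that the iteration need not converge on bipartite graphs, where the eigenvalue of largest magnitude is not unique; the example graph contains a triangle to avoid this.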

More elaborate indices defined on directed graphs are discussed below. In 
Fig. 1 they are illustrated on an acyclic grid. The grid is placed in a plane and
each grid point is then lifted according to its prominence. 

Hubs and authorities [20]. A natural notion of prominence for a Web resource is 
the extent to which it is referred to by other Web pages, in particular by those 
pages that specialize in listing useful resources. In turn, the property of being 
such a list of useful resources is a notion of prominence in itself. In these com- 
plementary and mutually reinforcing notions prominent resources are called au- 
thorities (resources with useful information) and hubs (pages with useful links). 

The hub score of a page is proportional to the combined authority of the
resources it links to, and the authority of a resource is proportional to the combined
hub score of the pages linking to it. In practice, hub and authority scores
are thus computed by iterating $p^{(0)} \leftarrow \mathbf{1}$ and

$p^{(2k+1)} \leftarrow A^T \cdot p^{(2k)}$
$p^{(2k+2)} \leftarrow A \cdot p^{(2k+1)}$.

For $h^{(k)} = p^{(2k)}$ and $a^{(k)} = p^{(2k+1)}$, alternating iteration can be written as

$h^{(k+1)} \leftarrow A A^T \cdot h^{(k)}$ (hubs)

$a^{(k+1)} \leftarrow A^T A \cdot a^{(k)}$ (authorities)

In this formulation, it is easy to see that the hub and authority indices in a graph
with adjacency matrix $A$ correspond to eigenvector centrality in the weighted
undirected graphs with adjacency matrix $A A^T$ and $A^T A$, respectively.

As can be seen in Fig. 1(a), vertices on and above the falling diagonal of the
grid have the highest authority, because they are in the midst of the undirected
graph induced by $A^T A$. Compare this to the undirected graph induced by $A A^T$,
indicating why the best hubs are found on and below this diagonal.
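
The alternating iteration for hubs and authorities can be sketched as follows. The function name `hubs_and_authorities`, the unweighted edge list, and the Euclidean normalization after each half-step are assumptions of this sketch.

```python
import math


def _normalize(vec):
    """Scale a score dict to Euclidean length 1 (unchanged if it is all zero)."""
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return vec if norm == 0.0 else {v: x / norm for v, x in vec.items()}


def hubs_and_authorities(vertices, edges, iterations=50):
    """Alternate a <- A^T h and h <- A a on an edge list of (u, v) pairs."""
    hub = {v: 1.0 for v in vertices}
    auth = {v: 0.0 for v in vertices}
    for _ in range(iterations):
        auth = {v: 0.0 for v in vertices}
        for u, v in edges:
            auth[v] += hub[u]        # authority: hub scores of pages linking in
        auth = _normalize(auth)
        hub = {v: 0.0 for v in vertices}
        for u, v in edges:
            hub[u] += auth[v]        # hub: authority of the resources linked to
        hub = _normalize(hub)
    return hub, auth


# Hypothetical example: h1 links to a1 and a2, h2 links to a1 only.
hub, auth = hubs_and_authorities(
    ["h1", "h2", "a1", "a2"], [("h1", "a1"), ("h1", "a2"), ("h2", "a1")]
)
```

In the example, `a1` receives the higher authority because both hubs link to it, and `h1` is the better hub because it links to both authorities.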

PageRank [5]. In another variant of eigenvector centrality the contribution of
each vertex to another vertex's prominence is weighted by its outdegree,
$p_v = \sum_{u:(u,v) \in E} \frac{a_{uv}}{d^-(u)}\, p_u$ (see [24,7]). If we require $p$ to be a probability distribution
over the set of vertices, this notion has a nice interpretation as the stationary
distribution of the simple random walk on the graph (or random surfer on the
Web, if you will), in which each edge leaving a vertex is chosen with equal
probability.

Let $M = D_-^{-1} A$ be the adjacency matrix normalized so that the rows sum to
one, where $D_-$ is the diagonal matrix with the outdegrees on the diagonal. Then,
$M$ is a stochastic matrix of transition probabilities, and a stationary distribution
$p = M^T \cdot p$ satisfies the above notion of prominence. However, if a vertex has
outdegree zero, the computation breaks down, and strongly connected components
may cause an undue increase of the prominence of their vertices. This
so-called “sink problem” can be avoided by introducing an escape mechanism.
Let $\hat{p}$ be an a-priori probability distribution over the vertices (e.g., user preferences
or general popularity of a resource), then with probability $\omega$ the random
walk picks an edge of the graph whereas with the remaining probability, it jumps
to any other vertex according to $\hat{p}$. The index is thus defined by

$p = \omega M^T p + (1 - \omega)\hat{p}$ (PageRank)

$\phantom{p} = (\omega M^T + (1 - \omega)\hat{p} \cdot \mathbf{1}^T) \cdot p$.

The second equality holds because $p$ is a probability distribution. From the
second expression it can be seen that PageRank is the eigenvector centrality of
a weighted graph with a complete set of additional escape edges. This modified
matrix is irreducible and aperiodic so that the iteration $p^{(0)} \leftarrow \frac{1}{|V|}\mathbf{1}$ and

$p^{(k+1)} \leftarrow (\omega M^T + (1 - \omega)\hat{p} \cdot \mathbf{1}^T) \cdot p^{(k)}$

converges to a unique prominence vector. On the grid in Fig. 1(b), the random 
surfer may jump to any vertex, but is most likely to walk towards the upper and 
right side of the grid, from where the only continuation is towards the upper 
right corner. 
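
The PageRank iteration with the escape mechanism can be sketched as follows. The function name `pagerank`, the default value 0.85 for the walk probability, the uniform default for the a-priori distribution, and the renormalization to sum 1 after each step are assumptions of this sketch; edges are taken to be unweighted.

```python
def pagerank(vertices, edges, omega=0.85, prior=None, iterations=100):
    """Iterate p <- (omega * M^T + (1 - omega) * prior * 1^T) p.

    M is the row-normalized adjacency matrix of the edge list; rows of
    vertices without outgoing edges are zero, and the vector is rescaled to
    sum 1 after every step (the tacit normalization of the text)."""
    n = len(vertices)
    if prior is None:
        prior = {v: 1.0 / n for v in vertices}   # assumed uniform prior
    outdegree = {v: 0 for v in vertices}
    for u, _ in edges:
        outdegree[u] += 1
    p = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        total = sum(p.values())                  # equals 1^T p
        nxt = {v: (1 - omega) * prior[v] * total for v in vertices}
        for u, v in edges:
            nxt[v] += omega * p[u] / outdegree[u]
        norm = sum(nxt.values())
        p = {v: x / norm for v, x in nxt.items()}
    return p


# Hypothetical example: a and b link to each other, c links to a.
p = pagerank(["a", "b", "c"], [("a", "b"), ("b", "a"), ("c", "a")])
```

In the example, `c` has no incoming edges and therefore receives only the escape probability mass, while `a` ranks highest.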

Katz’s status index [18]. As a generalization of simply using indegrees to measure 
‘status’ in social networks, the prominence of a vertex is determined by the 
number of directed paths of arbitrary length ending in the vertex, where the 
influence of longer paths is attenuated by a decay factor. Recall that the entries of
the $k$-th power of the adjacency matrix of an unweighted graph give the number
of paths of length $k$ between every pair of vertices. Therefore, this notion of
prominence is determined by

$p = \sum_{k=1}^{\infty} (\alpha A^T)^k\, \mathbf{1}$ (Katz's status)

where parameter $\alpha$ corresponds to the fraction of status that is passed along a
directed edge. For sufficiently small values of $\alpha$ (a convenient choice is determined
by the maximum indegree of any vertex in the graph), the sum converges
to $(I - \alpha A^T)^{-1} - I$. Therefore, the status vector can be obtained by
solving $(I - \alpha A^T) \cdot p = \alpha A^T \mathbf{1}$. Solving this system of linear equations directly
is prohibitive for large graphs. Standard sparse matrix approaches approximate
a solution iteratively. The update step in Jacobi iteration, for instance, yields
$p^{(k+1)} \leftarrow \alpha A^T \cdot p^{(k)} + \alpha A^T \mathbf{1}$. This iteration nicely reflects the underlying notion
of adding contributions from vertices farther and farther away. The same can be
observed in Fig. 1(c), where the attenuated influence from vertices in the lower
left does not suffice to discriminate the prominence of vertices in the upper right
any more.
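
The Jacobi update for Katz's status index can be sketched as follows. The function name `katz_status`, the unweighted edge list, the zero starting vector, and the fixed iteration count are assumptions of this sketch; the decay factor `alpha` must be small enough for the series to converge.

```python
def katz_status(vertices, edges, alpha, iterations=100):
    """Iterate p <- alpha * A^T p + alpha * A^T 1 on an edge list of (u, v).

    After k steps, p accounts for all directed paths of length at most k
    ending in each vertex, attenuated by alpha per edge."""
    base = {v: 0.0 for v in vertices}
    for _, v in edges:
        base[v] += alpha                 # alpha * A^T 1 = alpha * indegree
    p = {v: 0.0 for v in vertices}
    for _ in range(iterations):
        nxt = dict(base)
        for u, v in edges:
            nxt[v] += alpha * p[u]       # status passed along the edge (u, v)
        p = nxt
    return p


# Hypothetical example: the directed path a -> b -> c with alpha = 0.5.
p = katz_status(["a", "b", "c"], [("a", "b"), ("b", "c")], alpha=0.5)
```

For the path example, `b` is reached by one path of length 1 and `c` by one path of length 1 and one of length 2, which gives the values 0.5 and 0.75.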



In a sense, the above definitions of prominence are contained in the following
generic formulation of status in networks [17]. It puts a twist on eigenvector
centrality through the addition of an a-priori prominence vector $\hat{p}$,

$p = A^T p + \hat{p}$. (Hubbell's status)

By choosing appropriate weights and a-priori prominences, we obtain eigenvector
centrality and PageRank. Reordering, we have $p = (I - A^T)^{-1} \cdot \hat{p}$, provided the
inverse exists. If it does, it equals $\sum_{k=0}^{\infty} (A^T)^k$, therefore
$p = (I + \sum_{k=1}^{\infty} (A^T)^k) \cdot \hat{p}$. With uniform edge weights and $\hat{p} = \mathbf{1}$ we obtain a
prominence index in which every component is by one larger than Katz's status
index.



3 Spectral Graph Layout 

In the previous section we emphasized formal similarities in the definition of 
several popular prominence indices. In practice, all of them are computed by 
some variant of sparse matrix power iteration, i.e. by iterating over all vertices, 
and, for each vertex, combining the current scores of its neighbors. For really 
large graphs, most of the running time is spent on data transfer between internal 
and external memory. It is thus desirable not to cause additional swapping. 

In this section, we introduce a layout algorithm that produces meaningful 
layouts while synchronously operating on the same data that prominence im- 
plementations do, because it is based on the same principles as the ranking 
algorithms. It is therefore a simple matter to augment an existing system for 
ranking resources to compute a layout of the graph on the fly.

Layout with eigenvectors. For the layout, we consider the undirected, simple
graph obtained by omitting directions, loops, and multiple edges. Recall that
directions are already represented in the prominence dimension. Let $A$ be the
adjacency matrix of the skeleton and $D = D^+ = D^-$ its diagonal degree matrix.

We consider the Laplacian matrix L = D — A, which has interesting applications 
in many areas (see, e.g., [23]). This matrix is interesting for graph layout, since 
minimizing the associated quadratic form

$$x^T L x = \sum_{\{u,v\} \in E} (x_u - x_v)^2$$

corresponds to minimizing the squared distance between pairs of adjacent vertices,
if $x$ is interpreted as a vector of vertex positions. This corresponds to spring
embedding with zero-length springs and no repelling forces. Clearly, the minima
are obtained by assigning the same position to all vertices (recall that we assume
connectedness of the graph).
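
The identity between the quadratic form of the Laplacian and the sum of squared edge lengths can be checked numerically. The following is a small sketch under assumptions of our own (dense list-of-lists matrix, hypothetical function names), meant only to make the identity concrete.

```python
def laplacian_quadratic_form(n, edges, x):
    """Evaluate x^T L x for L = D - A, built densely for clarity."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1      # degree contributions (matrix D)
        L[v][v] += 1
        L[u][v] -= 1      # adjacency contributions (matrix -A)
        L[v][u] -= 1
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


def squared_edge_lengths(edges, x):
    """Sum of squared distances between adjacent vertices placed at x."""
    return sum((x[u] - x[v]) ** 2 for u, v in edges)
```

For a triangle with a pendant vertex, `edges = [(0, 1), (1, 2), (2, 0), (2, 3)]` and positions `x = [0, 1, 3, -2]`, both functions return the same value, and a constant vector such as `[5, 5, 5, 5]` gives zero, the undesirable minimum mentioned above.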

These undesirable solutions can be avoided by fixing some selected vertices 
at distinct positions. Minimization subject to these boundary conditions is the 
famous barycentric layout model of Tutte [28]. However, it requires some a-priori
knowledge about which vertices should be fixed where. 

Note that the undesired minima $x = c \cdot \mathbf{1}$ are the eigenvectors associated with
eigenvalue zero, i.e. $Lx = 0$. More generally, if $(\lambda, x)$ is any eigenpair of $L$, then
$\lambda = \frac{x^T L x}{x^T x}$. We therefore want to minimize

$$\frac{x^T L x}{x^T x} \quad \text{subject to } x \perp \mathbf{1},$$

since the eigenvectors of a symmetric matrix are orthogonal. Hence, the desired 
solution is the normalized eigenvector associated with the second-smallest eigen- 
value of L. This vector is called the Fiedler vector and, because of its distance 
minimization property, often used in graph partitioning (see, e.g., [26]). For the 
same reason, it yields a useful one-dimensional layout of a graph, because edges 
are short and hence dense subgraphs are clustered. If a rank visualization in three 
dimensions is desired (cf. Fig. 1), a reasonable choice for the second free dimen- 
sion is the normalized eigenvector minimizing the objective function subject to 
being orthogonal to 1 and the first solution. 

An example of two-dimensional layouts obtained by a typical spring embed- 
der and two eigenvectors of L is given in Fig. 2. While the spring embedder 
produces more uniform edge lengths, the eigenvectors emphasize structural clus- 
tering of vertices. 

Computing the eigenvectors. Eigenvectors associated with the smallest eigen- 
values of large sparse matrices are usually computed using Lanczos’ method. 
However, all popular prominence indices are computed using a variant of the 
much simpler power iteration, which only gives an eigenvector associated with 
the eigenvalue of largest magnitude. Since we want to synchronize layout and 
prominence computation, we consider the matrix $L' = 2\Delta \cdot I - L$ instead of $L$
itself, where $\Delta$ is the maximum degree of any vertex. The crucial observation is
that the eigenvectors of $L$ and $L'$ are identical, but the order of the magnitudes
of their corresponding eigenvalues is reversed.

Fig. 2. Two-dimensional layouts of a random planar triconnected graph: (a) spring embedder, (b) spectral layout

Straightforward application of power iteration on L' returns the principal 
eigenvector of L', which is the trivial eigenvector of L. Power iteration on 
a vector that is orthogonal to the principal eigenvector yields an eigenvector 
of the second-largest eigenvalue of L', and hence the desired layout for the first 
dimension. Iterating on a vector that is orthogonal to both the trivial eigenvector 
and the approximate solution for the first dimension yields the second dimension. 

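A vector $y$ is orthogonalized with respect to another vector $x$ by setting
$y \leftarrow y - \frac{x^T y}{x^T x} \cdot x$. Orthogonalization with respect to the trivial eigenvector is
even easier, since it corresponds to subtracting, from each entry of $y$, the mean
of all its entries. To obtain vectors $x$ and $y$ for a two-dimensional layout we thus
carry out the following augmented power iteration on random starting vectors
$x^{(0)}$, $y^{(0)}$, repeatedly orthogonalized with respect to $\mathbf{1}$ and to one another:

$$
\begin{aligned}
x^{(k+1)} &\leftarrow L' \cdot x^{(k)}, & x^{(k+1)} &\leftarrow x^{(k+1)} - \frac{1}{n} \sum_{v \in V} x_v^{(k+1)} \cdot \mathbf{1} \\
y^{(k+1)} &\leftarrow L' \cdot y^{(k)}, & y^{(k+1)} &\leftarrow y^{(k+1)} - \frac{1}{n} \sum_{v \in V} y_v^{(k+1)} \cdot \mathbf{1} \\
y^{(k+1)} &\leftarrow y^{(k+1)} - \frac{(x^{(k+1)})^T y^{(k+1)}}{(x^{(k+1)})^T x^{(k+1)}} \cdot x^{(k+1)}
\end{aligned}
$$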

Intuitively, the layout is centered, rectified, and (due to the tacitly assumed 
normalization) zoomed after each multiplication with L' . The last two lines are 
omitted if only one dimension needs to be determined for the layout. Though 
convergence may be slower than in the prominence computations, a few iterations 
are usually sufficient for a reasonable layout. 
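
The augmented power iteration can be sketched as follows. This is a plain-Python illustration rather than the LEDA-based implementation; the function name `spectral_layout`, the adjacency-list representation, the fixed iteration count, and the explicit normalization after every step are assumptions made for the sketch. The input is assumed to be a connected simple undirected graph.

```python
import math
import random


def spectral_layout(n, edges, iterations=200, seed=0):
    """Two layout coordinates by power iteration on L' = 2*Delta*I - L (sketch)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(a) for a in adj]
    max_deg = max(deg)

    def times_l_prime(x):
        # (L' x)_v = (2*Delta - deg(v)) * x_v + sum of x over the neighbours of v
        return [(2 * max_deg - deg[v]) * x[v] + sum(x[w] for w in adj[v])
                for v in range(n)]

    def centered(x):
        # Orthogonalize against the trivial eigenvector 1 by subtracting the mean.
        mean = sum(x) / n
        return [xi - mean for xi in x]

    def normalized(x):
        norm = math.sqrt(sum(xi * xi for xi in x))
        return [xi / norm for xi in x]

    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    y = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        x = normalized(centered(times_l_prime(x)))
        y = centered(times_l_prime(y))
        # x has unit length here, so x^T x = 1 in the orthogonalization step.
        dot = sum(a * b for a, b in zip(x, y))
        y = normalized([b - dot * a for a, b in zip(x, y)])
    return x, y
```

On a path with four vertices, the first coordinate orders the vertices along the path, which matches the distance-minimizing behaviour of the Fiedler vector; dropping the updates of `y` gives the one-dimensional variant.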

4 Application to Web Graph Models and Query Results

We demonstrate our visualization approach on two different kinds of data: random
Web graphs generated according to slightly modified versions of the evolving
copying models of [21] and the small-world model of [30], and a real-world example
obtained from an AltaVista query. Our C++ implementations use the
Library of Efficient Data Types and Algorithms (LEDA) [22].

Web graph models. In the linear growth model, a graph grows one vertex at a 
time. At each time step, a prototype is chosen among already existing vertices, 
and a new vertex is generated. This new vertex is then assigned a fixed number 
of outgoing edges. With some fixed probability, the ith of these edges points to 
a randomly selected vertex among those already existing (creation case), and 
with the remaining probability it points to the same vertex as the ith outgoing 
edge of the prototype vertex (copying case). Our generator does not introduce 
multiple edges, and if a prototype happens to not have enough outgoing edges, 
no edge is introduced in the copying case. Clearly, all graphs evolving like this 
are acyclic. 
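
A generator for the linear growth model might look like the sketch below. It is not the generator used for the figures: the function name, the parameter names `out_degree` and `p_create`, the choice of starting from a single vertex without edges, and the rule of silently dropping a repeated target are assumptions made here.

```python
import random


def linear_growth_copying(n, out_degree, p_create, seed=0):
    """Return out-neighbour lists of a graph grown one vertex at a time (sketch)."""
    rng = random.Random(seed)
    out = [[]]                              # assumed start: one vertex, no edges
    for v in range(1, n):
        prototype = rng.randrange(v)        # prototype among existing vertices
        targets = []
        for i in range(out_degree):
            if rng.random() < p_create:
                t = rng.randrange(v)        # creation case: random existing vertex
            elif i < len(out[prototype]):
                t = out[prototype][i]       # copying case: i-th edge of the prototype
            else:
                continue                    # prototype has no i-th edge: no edge added
            if t not in targets:            # do not introduce multiple edges
                targets.append(t)
        out.append(targets)
    return out
```

Every edge produced this way points from a newer vertex to an older one, which is why the resulting graphs are acyclic.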

In the exponential growth model, a graph grows by a fixed fraction of its cur- 
rent size at each time step. New vertices receive a fixed number of loops, and for 
each already existing edge, its target receives a new incoming edge for which, 
with some fixed probability, the source is chosen uniformly at random from the 
new vertices, and otherwise from the existing vertices with probability propor- 
tional to their current outdegree. We used a simpler model in which existing 
vertices are chosen uniformly at random as well. 

In the small-world model, we initially generate a cyclic sequence of vertices 
and let a vertex link to a fixed number of predecessors and successors. Then, 
each edge is rewired with some small probability by choosing a new destination
uniformly at random. 
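
The small-world construction can be sketched in the same style. The function name and parameters are hypothetical, and the sketch assumes `n > 2 * k` so that the initial ring has no repeated edges; like the description above, it does not filter loops or repeated edges that rewiring may create.

```python
import random


def small_world(n, k, p_rewire, seed=0):
    """Directed ring lattice with random rewiring of edge destinations (sketch)."""
    rng = random.Random(seed)
    edges = []
    for v in range(n):
        for d in range(1, k + 1):
            edges.append((v, (v + d) % n))   # link to the d-th successor
            edges.append((v, (v - d) % n))   # link to the d-th predecessor
    rewired = []
    for u, v in edges:
        if rng.random() < p_rewire:
            v = rng.randrange(n)             # new destination, uniformly at random
        rewired.append((u, v))
    return rewired
```

With `p_rewire = 0` the result is the plain cyclic structure; small positive values add the chords that crumple the cycle in the spectral layout.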

Figure 3 shows spectral layouts of graphs generated according to these models 
and rankings replacing the vertical dimension with PageRank as an example of a 
prominence index. There is no visible clustering in the evolving copying models. 
Moreover, the prominence of resources appears to be correlated with their age 
(also with the other indices outlined in Sect. 2). The figures thus graphically 
support the conclusion of [21] that death processes, i.e. the occasional deletion 
of vertices and edges, might be necessary for the evolving copying models to be 
realistic. In the small-world model, the spectral layout reveals a cycle crumpled
by chords, and the ranking shows that the model yields a rather egalitarian 
structure. 

Query results. The data for this example was compiled in a way similar to the 
HITS algorithm [20] . We asked the AltaVista search engine for pages containing 
the word “java” and used the first 200 URLs it returned as the root set. It 
was then expanded by asking AltaVista for pages containing links to resources 
in the root set (backward extension), and adding resources linked to by pages 
in the root set (forward extension). The graph was completed by adding edges
between pages in the resulting set of vertices. The computations were carried out
on the only large component of this graph from which some poorly connected
vertices were removed to prevent extreme clustering. The graph has more than
5000 vertices and 15000 edges.

Fig. 3. Web graph models (2D spectral layout and 1D spectral layout with PageRank): (a) linear growth evolving copying model [21], (b) exponential growth evolving copying model [21], (c) small-world model [30]

In Fig. 4, this graph is shown twice, with vertices positioned vertically accord- 
ing to the Fiedler vector, and horizontally according to one of two prominence 
indices. Again, links to less prominent resources are colored red. 

The most prominent resources match the expectations, but there are some 
less expected recommendations as well. It is clearly visible that some of these 
serve distinct user groups, like the Japanese directory in the upper right. Note 
that we may not conclude that vertically close vertices are closely connected 
without zooming into the image. However, it is safe to assume that vertically 
separated vertices are likely to be distant in the structure. This feature can serve 
to distinguish query results which contain a keyword that is used in different 
contexts (see the “jaguar” -query example in [20]). 

Figure 4 also shows that the top authorities are surprisingly distinguished 
from the rest of the graph, and quite different from our expectations. Most of 
them are located at Stars.com, a large repository for developers (“Web Devel- 
oper’s Virtual Library”). Since they are well connected among each other, it is 
by virtue of our layout approach that their vertical position is similar, and thus 
this phenomenon could be detected by visual exploration. In the figure, resources 
at this site are colored yellow. Not surprisingly, vertices with high hub scores are 
from this site as well. This simple example graphically explains why the original 
HITS algorithm does not consider links within a site. 

5 Conclusions 

We have proposed a method for Web graph visualization that provides unam- 
biguous identification of prominent resources while showing the entire graph and 
its clustering. The layout of our visualizations can be computed synchronously 
with common link-based rankings. Speed-up techniques such as in [16] that re- 
organize storage to reduce external memory access therefore directly apply to 
the layout algorithm as well. 

We expect our visualizations to be particularly useful for visual inspection of 
rankings, for teaching and communicating ranking procedures, and for evaluation 
and illustration of stochastic models of the Web graph. The main advantage of 
spectral graph layout, the correspondence with distance minimization and hence 
with clustering, becomes a drawback in cases where the underlying undirected 
graph is poorly connected, since denser subgraphs will be clustered in a very 
small interval. This problem is addressed in the full version of this paper, which 
also contains an analysis of the convergence properties of the iterative compu- 
tation. 

Acknowledgments. We thank Marco Gaertler for collecting the “java”-query 
data used in Sect. 4, and Stephen Muth for discussions regarding the iterative 
computation of Katz’s status index.

Fig. 4. Authority and PageRank visualization of “java” query result

Abstract. In this paper we introduce a new drawing style of a plane
graph G, called proper box rectangular (PBR) drawing. It is defined to 
be a drawing of G such that every vertex is drawn as a rectangle, called 
a box, each edge is drawn as either a horizontal or a vertical line seg- 
ment, and each face is drawn as a rectangle. We establish necessary and 
sufficient conditions for G to have a PBR drawing. We also give a simple 
linear time algorithm for finding such drawings. The PBR drawing is 
closely related to the box rectangular (BR) drawing defined by Rahman, 
Nakano and Nishizeki [17]. Our method can be adapted to provide a new
algorithm for solving the BR drawing problem. 



1 Introduction 



The problem of “nicely” drawing a graph $G$ has received increasing attention [?].
Such drawings are useful in visualizing planar graphs and find applications
in fields such as computer graphics, VLSI layout, algorithm animation and so
on. Among different drawing styles, the orthogonal drawing has attracted much
attention due to its applications in circuit layouts, database diagrams,
entity-relationship diagrams etc. [?, 7, 10, 14, 18, 19, 20, 22]. A survey of this area was given
in [?]. The definitions and examples given in this section are from [?].

In an orthogonal drawing of a plane graph G, each vertex is drawn as an 
integer grid point and each edge is drawn as a sequence of alternate horizontal 
and vertical line segments along grid lines as illustrated in Fig. 1(a). Every plane
graph $G$ with maximum degree $\le 4$ has an orthogonal drawing. However, a plane
graph with maximum degree $\ge 5$ has no orthogonal drawing.

A box-orthogonal drawing of a plane graph $G$ is a drawing of $G$ on an integer
grid such that each vertex is drawn as a rectangle, called a box, and each edge
is drawn as a sequence of alternate horizontal and vertical line segments along
grid lines as illustrated in Fig. 1(b). Some of the boxes may be degenerated (i.e.
points). A box-orthogonal drawing is a natural generalization of an orthogonal
drawing. Every plane graph has a box-orthogonal drawing even if its maximum
degree is $\ge 5$. The box-orthogonal drawing is studied in [?].

Fig. 1. (a) An orthogonal drawing, (b) a box-orthogonal drawing, (c) a rectangular
drawing, and (d) a box-rectangular drawing.



An orthogonal drawing of a plane graph $G$ is called a rectangular drawing
if each edge of $G$ is drawn as a straight line segment without bends and the
contour of each face of $G$ is drawn as a rectangle as illustrated in Fig. 1(c). Since
rectangular drawings have applications in VLSI floor-planning, this subject has
been extensively studied in [?].






A box-rectangular (BR) drawing was introduced in [17]: It is a drawing of
$G$ on an integer grid such that each vertex is drawn as a rectangle, called a box
(which may be degenerated), and the contour of each face is drawn as a rectangle,
as illustrated in Fig. 1(d). (A degenerated box will be called a point. A
non-degenerated box will be called a real box.) Several applications of BR drawings
are given in [17]. A linear time algorithm for finding such a drawing is presented
in [17]. Necessary and sufficient conditions for the existence of a BR drawing of
a plane graph are also given in [17]. These conditions are rather complicated.

The BR drawing defined in [17] allows vertices to be drawn as points. This is
not desirable for applications such as floor-planning. A BR drawing is called
a proper box rectangular (PBR) drawing if every vertex of $G$ is drawn as a
real box. In this paper, we establish necessary and sufficient conditions for a
graph $G$ to have a PBR drawing. We also present a linear time algorithm for
constructing PBR drawings. Although BR drawing and PBR drawing are similar,
our approach for solving this problem is totally different from that in [17]. Our
necessary and sufficient conditions are logically simple and clean. Our algorithm
is conceptually simpler. With slight modification, our method can be applied to
the BR drawing also. This provides another linear time algorithm and another set
of necessary and sufficient conditions for a graph to have a BR drawing. Moreover,
our method can be easily adapted to solve other similar drawing problems.

2 Preliminaries

Throughout the paper, $G = (V, E)$ denotes a connected plane graph with no
self-loops, but possibly with multiple edges. Let $n = |V|$ and $m = |E|$. The embedding
of $G$ divides the plane into a number of regions. The unbounded region is called
the exterior face. Other regions are called interior faces. The degree of a face $F$
of $G$ is the number of edges on its boundary.

Let $G^* = (V^*, E^*)$ denote the dual graph of $G$. For each edge $e$ in $G$, $e^*$
denotes its dual edge. For a subset $E_1 \subseteq E$, $E_1^*$ denotes the set of the dual edges
in $G^*$ corresponding to the edges in $E_1$. If a cycle $C$ of $G$ contains $k$ vertices, it
is a $k$-cycle. A triangle (quadrangle, resp.) is a 3-cycle (4-cycle, resp.). A cycle $C$
divides the plane into its interior region and exterior region. If there is at least
one vertex in its interior region, $C$ is called a non-empty cycle of $G$.

Consider a cut vertex $v$ of $G$. Suppose $v$ is an interior vertex. Let $F$ be the
interior face of $G$ such that $v$ appears on its boundary more than once. For each
connected component $D$ of $G - \{v\}$, the subgraph of $G$ induced by $V(D) \cup \{v\}$
is called an extended component of $G - \{v\}$. Let $H$ be an extended component
of $G - \{v\}$ surrounded by $F$ (see Fig. 2(a)). It is impossible to draw $G$ on the
plane so that the face $F$ is a rectangle since there is no room to draw the vertices
in $H$. Suppose $v$ is an exterior vertex. Let $H_1$ be one extended component of
$G - \{v\}$ and $H_2$ be the union of the other extended components of $G - \{v\}$ (see
Fig. 2(b)). In any PBR drawing of $G$, the box representing $v$ must touch both the
top and the bottom boundary (or both the left and the right boundary). Thus,
in order to obtain a PBR drawing of $G$, we can recursively find a PBR drawing
of $H_1$ while designating $v$ as two adjacent corners (this forces the rectangle for
$v$ to span the whole length of the drawing of $H_1$) and a similar PBR drawing for
$H_2$; then merge the rectangle for $v$ in the drawing of $H_1$ and the rectangle for
$v$ in the drawing of $H_2$. This gives a PBR drawing of $G$ (see Fig. 2(c)). Thus,
without loss of generality, we will always assume $G$ is biconnected from now on.






Fig. 2. (a) Interior cut vertex $v$, (b) exterior cut vertex $v$, (c) drawings of $H_1$ and $H_2$.



Our PBR drawing algorithm is based on the concept of the rectangular dual
defined as follows. Let $R$ be a rectangle. A rectangular subdivision system of $R$ is a
partition of $R$ into a set $\Phi = \{R_1, \ldots, R_n\}$ of non-intersecting smaller rectangles
such that no four rectangles in $\Phi$ meet at the same point. A rectangular dual
of a graph $G = (V, E)$ is a rectangular subdivision system $\Phi$ and a one-to-one
correspondence $f : V \to \Phi$ such that two vertices $u$ and $v$ are adjacent in $G$ iff
their corresponding rectangles $f(u)$ and $f(v)$ share a common boundary. Fig. 3(c)
illustrates a rectangular dual of the plane graph in Fig. 3(b).





Fig. 3. Rectangular dual of plane graphs.



Consider a plane graph $H = (V, E)$. Let $v_0, v_1, v_2, v_3$ be four vertices on
the exterior face of $H$ in counterclockwise order. Let $P_i$ ($i = 0, 1, 2, 3$) be the
four paths on the exterior face of $H$ consisting of the vertices between $v_i$ and
$v_{i+1}$ (addition is mod 4). We seek a rectangular dual $R_H$ of $H$ such that the
four vertices $v_0, v_1, v_2, v_3$ correspond to the four corner rectangles of $R_H$ and
the vertices on $P_0$ ($P_1$, $P_2$, $P_3$, resp.) correspond to the rectangles located along
the north (west, south, east, resp.) boundary of $R_H$. Necessary and sufficient
conditions for testing if $H$ has a rectangular dual were discussed in [?].

To simplify the problem, we modify $H$ as follows: Add four new vertices
$v_n, v_w, v_s, v_e$ and connect $v_n$ ($v_w, v_s, v_e$, resp.) to every vertex on $P_0$ ($P_1$, $P_2$, $P_3$,
resp.). Then add new edges $(v_n, v_w)$, $(v_w, v_s)$, $(v_s, v_e)$, $(v_e, v_n)$. Let $G$ be the
resulting graph. (Fig. 3(a) shows a graph $H$ and Fig. 3(b) shows the corresponding
graph $G$.) It is easy to see that $H$ has a rectangular dual $R_H$ with $v_0, v_1, v_2, v_3$
corresponding to the four corner rectangles iff $G$ has a rectangular dual $R$.

If G has a rectangular dual R, then every face of G, except the exterior 
face, must be a triangle (since no four rectangles of R meet at the same point). 
Moreover, since at least four rectangles are needed to fully enclose some non- 
empty area on the plane, any non-empty cycle of G must have length at least 4. 
The following theorem was given in [12].



Theorem 1. A plane graph $G = (V, E)$ has a rectangular dual $R$ with four
rectangles on the boundary of $R$ iff the following conditions hold: (1) Every
interior face of $G$ is a triangle and the exterior face of $G$ is a quadrangle; (2) $G$
has no non-empty triangles; and (3) $G$ has no multiple edges.

A graph satisfying the three conditions in Theorem 1 is called a proper
triangular planar (PTP for short) graph.

Theorem 2. [?] Given a plane graph $G$, in linear time, we can check the
conditions in Theorem 1 and construct a rectangular dual of $G$ if these conditions
are satisfied.

3 PBR Drawing with Designated Corner Vertices 

In this section we establish necessary and sufficient conditions for the existence of 
a PBR drawing of a plane graph G when the four corner vertices are designated. 
We also give a linear time algorithm for finding such a drawing if it exists. 

Let $G$ be a biconnected plane graph. Let $v_0, v_1, v_2, v_3$ be the four vertices on
the exterior face (in counterclockwise order) of $G$ designated as corner vertices.
These designated vertices may be repeated. For example, if $v_0 = v_1$, then in the
PBR drawing of $G$, a single box occupying two corners represents the vertex
$v_0 = v_1$. We use $P_i$ ($i = 0, 1, 2, 3$) to denote the four paths on the exterior face
of $G$ consisting of the vertices between (and including) $v_i$ and $v_{i+1}$.

Definition 1. Let $G$ be a biconnected plane graph with designated corner
vertices $v_0, v_1, v_2, v_3$. The extended graph $G_x$ is obtained from $G$ as follows:

1. Add four new vertices $v_n, v_w, v_s, v_e$ and connect $v_n$ ($v_w, v_s, v_e$, resp.) to every
vertex on $P_0$ ($P_1$, $P_2$, $P_3$, resp.). Add four new edges $(v_n, v_w)$, $(v_w, v_s)$, $(v_s, v_e)$,
$(v_e, v_n)$. For each interior face $F$ of $G$, add a new vertex $v_F$ in $F$ and connect
it to the vertices on the boundary of $F$.

2. For each edge $e$ in the original graph $G$, delete $e$ and add a new edge between
$v_{F_1}$ and $v_{F_2}$, where $F_1$ and $F_2$ are the two faces of $G$ with $e$ on their boundary.

Fig. 4(b) shows the extended graph $G_x$ of the graph $G$ in Fig. 4(a). The
vertices of $G$ will be called original vertices. The vertices (edges, resp.) added
in Step 1 are called added vertices (edges, resp.) of $G_x$ and denoted by $V_a$ ($E_a$,
resp.). The edges introduced in Step 2 are called dual edges and denoted by $E_d$.
Thus $G_x = (V \cup V_a, E_a \cup E_d)$. We list structural properties of $G_x$ below.

- After Step 1, every interior face of the resulting graph $G_0 = (V \cup V_a, E \cup E_a)$
is a triangle, and the exterior face of $G_0$ is a quadrangle. Consider any
edge $e = (u, v)$ in the original graph $G$. Let $F_1$ and $F_2$ be the two faces
in $G$ with $e$ on their boundaries. In $G_0$, $e$ is a diagonal of the quadrangle
$Q = \{u, v, v_{F_1}, v_{F_2}\}$. In Step 2, $e = (u, v)$ is replaced by the other diagonal
$(v_{F_1}, v_{F_2})$ of $Q$. Thus, all interior faces of $G_x$ are still triangles.

- Each original vertex is incident only to the added vertices in $G_x$. For each
added vertex $v_F$ that corresponds to an interior face $F$ with degree $k$ in $G$,
the degree of $v_F$ is $2k$ in $G_x$, and the edges incident to $v_F$ alternate between
the added edges and the dual edges.

Fig. 4. (a) $G$, (b) $G_x$ (the added vertices are drawn as empty circles, the dual edges
are drawn as light curves), (c) a rectangular dual of $G_x$ which is also a PBR drawing
of $G$ after removing the four rectangles corresponding to $v_n, v_w, v_s, v_e$.



Lemma 1. G has a PBR drawing iff Gx has a rectangular dual. 

The proof is omitted due to space limitation. 

The extended graph Gx can be easily constructed in linear time from the 
embedding data structure of $G$. Thus by Lemma 1 and Theorems 1 and 2 we have:

Theorem 3. Let $G$ be a biconnected plane graph with four designated vertices
$v_0, v_1, v_2, v_3$. In linear time, we can test whether $G$ has a PBR drawing with
$v_0, v_1, v_2, v_3$ as the four corner vertices, and construct one if it exists.

Next we present necessary and sufficient conditions for the existence of a
PBR drawing solely in terms of $G$. An edge-cut of $G = (V, E)$ is a minimal
subset $X \subseteq E$ such that the graph $G - X = (V, E - X)$ is disconnected. If
$|X| = k$, we say $X$ is a $k$-edge-cut. A 2-mixed-cut of $G$ is a pair $\{v, e\}$ where
$v \in V$ and $e \in E$ such that $G - \{e\} - \{v\}$ is disconnected.
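
For small graphs, these definitions can be checked by brute force. The sketch below is only meant to make the definitions concrete; it is not the linear time method of this paper, and the function names and the edge-list representation are assumptions made here. Edges are addressed by index so that multiple edges are handled.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Depth-first search; edges with an endpoint outside `vertices` are ignored."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def k_edge_cuts(vertices, edges, k):
    """All minimal edge sets of size k whose removal disconnects the graph."""
    cuts = []
    for idx in combinations(range(len(edges)), k):
        rest = [e for i, e in enumerate(edges) if i not in idx]
        if is_connected(vertices, rest):
            continue
        # Minimality: putting back any single edge must reconnect the graph.
        if all(is_connected(vertices, rest + [edges[i]]) for i in idx):
            cuts.append(tuple(edges[i] for i in idx))
    return cuts


def two_mixed_cuts(vertices, edges):
    """All pairs (v, e) such that removing vertex v and edge e disconnects the graph."""
    cuts = []
    for v in vertices:
        rest_v = [u for u in vertices if u != v]
        for i, e in enumerate(edges):
            rest_e = [f for j, f in enumerate(edges) if j != i]
            if not is_connected(rest_v, rest_e):
                cuts.append((v, e))
    return cuts
```

A 4-cycle, for instance, has six 2-edge-cuts (every pair of edges) and no 3-edge-cut, because any three removed edges contain a smaller disconnecting pair, whereas the complete graph on four vertices has no 2-edge-cut and exactly four 3-edge-cuts, one around each vertex.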

Lemma 2. Let $G = (V, E)$ be a biconnected plane graph and $X$ a 2-edge-cut or
a 3-edge-cut of $G$. Then $X$ contains either 0 or 2 exterior edges of $G$.

Proof. Consider the dual graph $G^*$ of $G$. Let $v^*$ be the vertex in $G^*$ corresponding
to the exterior face of $G$. For $k = 2$ or 3, a subset $X \subseteq E$ is a $k$-edge-cut of $G$
iff its corresponding set $X^*$ of dual edges is a $k$-cycle in $G^*$. If $X$ contains an
exterior edge of $G$, then $v^*$ is on the cycle $X^*$. To complete the cycle, $X^*$ must
contain exactly two dual edges corresponding to two exterior edges of $G$. So $X$
must contain exactly two exterior edges. □

Let $X = \{e_1, e_2\}$ be a 2-edge-cut (or $X = \{e_1, e_2, e_3\}$ a 3-edge-cut,
respectively). By Lemma 2, $X$ can be classified as follows, where $e_1$ and $e_2$ denote
the two exterior edges of $X$ if it has any:

- $X$ is an interior edge-cut if all of its edges are interior edges;

- $X$ is a cross exterior edge-cut if $e_1 \in P_i$, $e_2 \in P_j$ and $|i - j| = 2$;

- $X$ is a corner exterior edge-cut if $e_1 \in P_i$, $e_2 \in P_j$ and $|i - j| = 1$ or 3;

- $X$ is a 1-side exterior edge-cut if $e_1 \in P_i$, $e_2 \in P_j$ and $i = j$.

Theorem 4. A biconnected plane graph $G$ has a PBR drawing with $v_0, v_1, v_2, v_3$
as the four corner vertices iff the following hold:

1. All 2-edge-cuts of $G$ are cross exterior edge-cuts; and

2. All 3-edge-cuts of $G$ are cross exterior or corner exterior edge-cuts; and

3. $G$ has no 2-mixed-cut.
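
The classification of edge-cuts and the first two conditions of Theorem 4 translate directly into a small lookup. The following sketch assumes that each cut is described only by the indices of the boundary paths containing its exterior edges; the function names are hypothetical.

```python
def classify_cut(exterior_path_indices):
    """Classify a 2- or 3-edge-cut by the paths P_i, P_j holding its exterior edges.

    exterior_path_indices: the indices i, j in {0, 1, 2, 3}, or an empty
    sequence if all edges of the cut are interior edges.
    """
    if len(exterior_path_indices) == 0:
        return "interior"
    if len(exterior_path_indices) != 2:
        raise ValueError("by Lemma 2 a cut has 0 or 2 exterior edges")
    i, j = exterior_path_indices
    d = abs(i - j)
    if d == 2:
        return "cross"      # edges on opposite sides of the drawing
    if d in (1, 3):
        return "corner"     # edges on two sides meeting at a corner
    return "1-side"         # both edges on the same side


def cut_is_admissible(size, kind):
    """Conditions 1 and 2 of Theorem 4 for one cut of the given size and kind."""
    if size == 2:
        return kind == "cross"
    if size == 3:
        return kind in ("cross", "corner")
    raise ValueError("only 2- and 3-edge-cuts are restricted")
```

For example, a 3-edge-cut with exterior edges on $P_3$ and $P_0$ is a corner exterior edge-cut and is allowed, while a 2-edge-cut of the same kind rules out a PBR drawing with the chosen corners.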

Proof. If: If we can show the extended graph $G_x$ is a PTP graph, then $G$ has a
PBR drawing by Theorem 1 and Lemma 1. As noted before, all interior faces of
$G_x$ are triangles and the exterior face of $G_x$ is a quadrangle.

Next we show $G_x$ has no multiple edges. Towards a contradiction, suppose
$G_x$ has two multiple edges which form a 2-cycle $C$. Suppose that $C$ contains
an added edge $(v, v_F)$ in $G_x$ between an original vertex $v$ and an added vertex
$v_F$. This can happen only if $v$ appears on the boundary of the face $F$ (of $G$)
more than once. Then $v$ is a cut vertex of $G$ and this contradicts the assumption
that $G$ is biconnected. Thus $C$ must consist of two dual edges $e_1^*$ and $e_2^*$. Let $e_1$
and $e_2$ be the two edges in $G$ corresponding to $e_1^*$ and $e_2^*$, respectively. If both
$e_1$ and $e_2$ are interior edges of $G$, then $\{e_1, e_2\}$ is an interior 2-edge-cut of $G$.
This contradicts our assumption. Suppose $e_1$ is an exterior edge of $G$. Then $C$
contains one of the exterior vertices, say $v_n$, of $G_x$. Hence $e_1 \in P_0$. Since all dual
edges incident to $v_n$ correspond to edges in $P_0$, $e_2$ must be in $P_0$ also. Thus
$\{e_1, e_2\}$ is a 1-side exterior 2-edge-cut of $G$. This contradicts our assumption.






Fig. 5. The illustrations of the proof of the “if” part of Theorem 4



Next we show that $G_x$ has no non-empty triangles. Towards a contradiction,
suppose $G_x$ has a non-empty triangle $C$.

Case 1: $C$ contains an exterior edge of $G_x$, say $(v_n, v_w)$. The other two edges
of $C$ must share a common end vertex $w$ in $G_x$. The only original vertex adjacent
to both $v_n$ and $v_w$ in $G_x$ is the corner vertex $v_1$. However, the triangle consisting
of the three edges $(v_n, v_w)$, $(v_n, v_1)$, $(v_w, v_1)$ is an empty triangle in $G_x$. Thus the
common vertex $w$ must be an added vertex and the other two edges of $C$ must
be two dual edges $e_1^*$ and $e_2^*$, where the edge $e_1$ corresponding to $e_1^*$ is on $P_0$
and the edge $e_2$ corresponding to $e_2^*$ is on $P_1$ (see Fig. 5(a)). Then $\{e_1, e_2\}$ is a
corner exterior 2-edge-cut of $G$. This contradicts our assumption.

Case 2: $C$ contains no exterior edge of $G_x$, but contains at least one added
edge, say $(v, v_{F_1})$ between an original vertex $v$ and an added vertex $v_{F_1}$ in $G_x$.
Since the original vertex $v$ is adjacent to only the added vertices, $C$ must consist
of $v$ and two added vertices $v_{F_1}$ and $v_{F_2}$ in $G_x$. Then the three edges of $C$
are: $(v, v_{F_1})$, $(v, v_{F_2})$ and the dual edge $e^* = (v_{F_1}, v_{F_2})$. Let $e$ be the edge of
$G$ corresponding to $e^*$. Then $\{v, e\}$ is a 2-mixed-cut of $G$ (see Fig. 5(b)). This
contradicts our assumption.

Case 3: $C$ contains no added edges. Then $C$ consists of three dual edges
$e_1^*, e_2^*, e_3^*$. Let $e_1, e_2, e_3$ be the three edges in $G$ corresponding to $e_1^*, e_2^*, e_3^*$,
respectively. If $e_1, e_2, e_3$ are all interior edges, they form an interior 3-edge-cut of
$G$. This contradicts our assumption. Suppose $e_1$ is an exterior edge, say on $P_0$.
Then $C$ contains $v_n$. Since all dual edges incident to $v_n$ correspond to the edges
in $P_0$, another edge of $C$ must correspond to an edge in $P_0$. Then $\{e_1, e_2, e_3\}$ forms
a 1-side exterior 3-edge-cut (see Fig. 5(c)), a contradiction again.

Since all cases lead to contradictions, $G_x$ has no non-empty triangles. So $G_x$
is a PTP graph.




Only If: We show that if any of the three conditions fails, then $G$ has no
PBR drawing. Suppose that $G$ has a 2-edge-cut $X$ that is not a cross exterior
edge-cut. There are three possible cases:

- $X$ consists of two interior edges $e_1, e_2$. Let $F_1, F_2$ be the two faces of $G$
with $e_1$ and $e_2$ on their boundary. Let $H_1$ and $H_2$ be the two connected
components of $G - \{e_1, e_2\}$. It is impossible to draw the faces $F_1$, $F_2$ and the
vertices in $H_1$ and $H_2$ as rectangles (see Fig. 6(a)).

- $X$ consists of two exterior edges $e_1, e_2$ located on the paths $P_i$ and $P_j$ where
$|i - j| = 1$ or 3. Let $F_1$ be the interior face of $G$ with $e_1$ and $e_2$ on its
boundary. Let $H_1$ and $H_2$ be the two connected components of $G - \{e_1, e_2\}$. It
is impossible to draw the face $F_1$ and the vertices in $H_1$ and $H_2$ as rectangles
(see Fig. 6(b)).

- $X$ consists of two exterior edges $e_1, e_2$ located on the same path $P_i$. Let $F_1$
be the interior face of $G$ with $e_1$ and $e_2$ on its boundary. Let $H_1$ and $H_2$ be
the two connected components of $G - \{e_1, e_2\}$. It is impossible to draw the
face $F_1$ and the vertices in $H_1$ and $H_2$ as rectangles.

Similarly, if $G$ has a 3-edge-cut $X = \{e_1, e_2, e_3\}$ that is neither a cross nor a
corner exterior edge-cut, or if $G$ has a 2-mixed-cut $\{v, e\}$, we can show $G$ has no
PBR drawing (see Fig. 6(c), (d) and (e)). □



4 PBR Drawing with No Designated Corner Vertices 

In this section, we consider the case where no vertices of G are designated as
corner vertices. Let G = (V, E) be a biconnected plane graph. Let $C_0$ be the
cycle that forms the exterior face of G. Let $u_1, u_2, \ldots, u_t$ be the exterior vertices
of G in counterclockwise order. Let $e_1, e_2, \ldots, e_t$ be the exterior edges of G where
$e_i = (u_i, u_{i+1})$ ($1 \le i \le t-1$) and $e_t = (u_t, u_1)$.

We want to choose four vertices $v_0, v_1, v_2, v_3$ on $C_0$ so that G has a PBR
drawing with $v_0, v_1, v_2, v_3$ as corner vertices. To make this possible, we must choose
the corner vertices so that the conditions in Theorem 0 are satisfied. If this is
possible we say G is feasible. In this section, we describe a linear time algorithm
for checking if G is feasible. Note that the notions of 2-mixed-cuts and internal
edge-cuts are defined solely in terms of the plane graph G, and are independent
of the choice of $v_0, v_1, v_2, v_3$. On the other hand, the notions of cross, corner
and 1-side exterior edge-cuts depend on both G and the choice of $v_0, v_1, v_2, v_3$.

Let $G^*$ be the dual graph of G. Let $v^*$ be the vertex in $G^*$ corresponding to
the exterior face of G. Let $e_i^*$ ($1 \le i \le t$) be the dual edge in $G^*$ corresponding
to $e_i$. The enlarged dual graph of G, denoted by $\hat{G}^*$, is obtained from $G^*$ as
follows: (a) remove the vertex $v^*$ from $G^*$; (b) add a cycle $C^*$ consisting of t
new vertices $v_i^*$ ($1 \le i \le t$); (c) make the edge $e_i^*$ ($1 \le i \le t$) incident to $v_i^*$. See
Fig 7 (a). Note that if we shrink the cycle $C^*$ into a single vertex (and delete
self-loops), then $\hat{G}^*$ becomes $G^*$. The method described in this section can be
presented using $G^*$ only. However, it is easier to illustrate the ideas by using $\hat{G}^*$.
Note that each vertex $v_i^*$ ($1 \le i \le t$) on $C^*$ corresponds to the exterior edge $e_i$
of G, and each exterior edge on $C^*$ corresponds to the exterior vertex
$u_i$ on $C_0$. An arc of $C^*$ is a continuous section of $C^*$. Let A be an arc of $C^*$.
Later in this section by saying “choose a corner vertex in A”, we mean “choose
a vertex in $C_0$ that corresponds to an edge in A as a corner vertex”.

A bridge 2-path (3-path, resp.) of the enlarged dual graph $\hat{G}^*$ is a path P in
$\hat{G}^*$ consisting of two (three, resp.) edges such that its two end vertices are on
the cycle $C^*$ and its internal vertex (vertices, resp.) are interior vertices of $\hat{G}^*$.
Let $\tilde{G}^*$ be the subgraph of $\hat{G}^*$ consisting of $C^*$ and the edges in all bridge
2- and 3-paths (see Fig 7 (b)). It is easy to see that $\tilde{G}^*$ can be obtained by first
constructing the subgraph of $\hat{G}^*$ induced by the vertices on the cycle $C^*$ and the
vertices adjacent to $C^*$, then deleting all degree-1 vertices.

Note that a subset $X \subseteq E$ forms an exterior 2-edge-cut (3-edge-cut, resp.)
iff the corresponding set $X^*$ of dual edges forms a bridge 2-path (3-path, resp.)
of the enlarged dual graph. For any bridge path P, the two end vertices u, v of P
divide the cycle $C^*$ into two arcs $A_1$ and $A_2$ (u and v belong to both $A_1$ and $A_2$),
which are called the two arcs defined by P.
Fig. 7. (a) A graph G (drawn as solid lines and dark circles) and the corresponding
enlarged dual graph $\hat{G}^*$ (drawn as empty circles and light lines); (b) the graph $\tilde{G}^*$.



Let P be a bridge 2-path consisting of two dual edges $e_i^*, e_j^*$. Then
$X = \{e_i, e_j\}$ is a 2-edge-cut of G. Let $A_1$ and $A_2$ be the two arcs defined by P. In
order to make X a cross exterior edge-cut of G, we must choose two corner
vertices on $A_1$ and two corner vertices on $A_2$.

Let P be a bridge 3-path consisting of three dual edges $e_i^*, e_j^*, e_k^*$. Then
$X = \{e_i, e_j, e_k\}$ is an exterior 3-edge-cut of G. Let $A_1$ and $A_2$ be the two arcs defined
by P. In order to make X a corner exterior edge-cut of G, we must choose either
exactly one corner vertex on $A_1$ or exactly one corner vertex on $A_2$. In order to
make X a cross exterior edge-cut of G, we must choose two corner vertices on
$A_1$ and two corner vertices on $A_2$.
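
The two requirements above can be checked mechanically for a given choice of corners. The following is a minimal sketch, not the paper's algorithm; the indexing is an assumption made only for illustration: the cycle vertices are numbered 0..t-1, cycle edge k joins vertex k to vertex (k + 1) mod t, and a corner is identified with the cycle edge it corresponds to. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def corners_on_arcs(corners: Sequence[int], i: int, j: int) -> Tuple[int, int]:
    """Count the chosen corners on the two arcs defined by end vertices i and j.

    Assumed indexing: cycle edge k joins cycle vertex k and vertex k + 1
    (wrapping around), so the arc from min(i, j) to max(i, j) contains
    exactly the edges min(i, j), ..., max(i, j) - 1.
    """
    lo, hi = min(i, j), max(i, j)
    on_first = sum(1 for k in corners if lo <= k < hi)
    return on_first, len(corners) - on_first


def cut_is_satisfied(path_edges: int, corners: Sequence[int], i: int, j: int) -> bool:
    """Check the demand of one bridge path with end vertices i and j.

    A bridge 2-path must become a cross cut (two corners on each arc).
    A bridge 3-path must become a corner cut (exactly one corner on some
    arc) or a cross cut (two corners on each arc).
    """
    a, b = corners_on_arcs(corners, i, j)
    if path_edges == 2:
        return (a, b) == (2, 2)
    return a == 1 or b == 1 or (a, b) == (2, 2)
```

For example, with corners on cycle edges 0, 2, 4, 6, a bridge 2-path between cycle vertices 1 and 5 sees two corners on each arc and is satisfied, while one between vertices 0 and 2 is not.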

As we have seen, each exterior 2- or 3-edge-cut X of G corresponds to a bridge
2-path or 3-path of the enlarged dual graph. To ensure that X satisfies the conditions
in Theorem 0, it demands either 1 or 2 corner vertices on certain arcs. Since we can
choose exactly four corner vertices, if the total demand for corner vertices exceeds 4,
then G is not feasible. In the following, we describe how to verify this condition.

Let P be a bridge path and $A_1$ and $A_2$ be the two arcs defined by P. We say
that another bridge path $P'$ is on $A_1$ if both end vertices of $P'$ are in $A_1$. P is
called a primary bridge path if no other bridge path is on one of the arcs defined
by P; this arc is called the empty arc of P. (If there is only one bridge path P,
then P is a primary bridge path and both arcs defined by P are empty arcs.)

Let $G_2^*$ be the subgraph of the enlarged dual graph consisting of the cycle $C^*$
and all bridge 2-paths. Let H be the graph obtained from $G_2^*$ as follows. For each
internal face F in $G_2^*$, there is a node $v_F$ in H. Two nodes $v_{F_1}$ and $v_{F_2}$ are
adjacent in H iff the two corresponding faces $F_1$ and $F_2$ share a common boundary
in $G_2^*$. $G_2^*$ is a ladder if H is a path. The two faces of $G_2^*$ corresponding to the
two end nodes of H are called the two end regions of $G_2^*$.

For an example, consider the graph shown in Fig 7 (b). The graph has four primary
bridge paths: $P_0$, $P_1$, $P_4$ and $P_5$. The corresponding graph $G_2^*$ consists of the
cycle $C^*$ and the bridge 2-paths $P_0$ and $P_3$. The corresponding graph H is a path
consisting of three nodes. So $G_2^*$ is a ladder. One end region of $G_2^*$ is the region
bounded by the arc {f* , fn, '*^ 9} and the path Pq. The other end region of 
G2 is the region bounded by the arc {ug, U 5, Ug, Uy} and the path Pg. 

Theorem 5. G is feasible iff one of the following two conditions holds:

1. The enlarged dual graph has no bridge 2-paths and there are at most four
primary bridge 3-paths.

2. The enlarged dual graph has at least one bridge 2-path, and (a) $G_2^*$ is a ladder;
and (b) all primary bridge 3-paths are in the two end regions of $G_2^*$; and (c) each
end region contains at most two primary bridge 3-paths.

Proof. (1) Assume the enlarged dual graph has no bridge 2-paths. Suppose that
there are $l \le 4$ primary bridge 3-paths. We can choose one corner vertex on the
empty arc of each primary bridge 3-path. (If $l < 4$, the other $4 - l$ corner vertices
can be chosen arbitrarily.) This way, for any bridge 3-path P, at least one corner
vertex is chosen on each of the two arcs defined by P. Thus G is feasible.

Suppose that there are more than four primary bridge 3-paths. Regardless
of how we choose the four corner vertices, for at least one primary bridge 3-path
P, no corner vertex is chosen on the empty arc of P. Thus G is not feasible.

(2) Assume G* has at least one bridge 2-path. 

Suppose the conditions 2 (a), 2 (b) and 2 (c) hold. We can choose one corner
vertex on the empty arc of each primary bridge 3-path. If fewer than four corner
vertices are chosen, we can choose the remaining corner vertices so that exactly
two corner vertices are chosen in each of the two end regions of $G_2^*$. This way,
exactly two corner vertices are chosen in each arc defined by every bridge
2-path, and at least one corner vertex is chosen in each arc defined by every bridge
3-path. Hence G is feasible.

Suppose the condition 2 (a) fails. We must choose 2 corner vertices on the
empty arc defined by every primary bridge 2-path. Since $G_2^*$ is not a ladder, there
are at least 3 primary bridge 2-paths in $G_2^*$. So we must choose at least 6 corner
vertices, which is impossible.

Suppose the condition 2 (b) or 2 (c) fails. When we try to choose at least 
1 corner vertex on the empty arc defined by each primary bridge 3-path, and 
exactly 2 corner vertices on each arc defined by each bridge 2-path, we end up 
choosing at least 3 corner vertices on an arc defined by a bridge 2-path. 

Thus if one of the conditions 2 (a), 2 (b) or 2 (c) fails, G is not feasible. □ 
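
Once the relevant quantities are known, the test in Theorem 5 reduces to a few comparisons. The sketch below is an illustration only: it assumes the counts have already been computed by some other means (the linear-time computation itself is not shown in this section), and the function and parameter names are hypothetical.

```python
from typing import Tuple


def is_feasible(num_bridge_2paths: int,
                num_primary_3paths: int,
                g2_is_ladder: bool = True,
                end_region_counts: Tuple[int, int] = (0, 0)) -> bool:
    """Evaluate the two conditions of Theorem 5 from precomputed counts.

    num_primary_3paths is the total number of primary bridge 3-paths.
    end_region_counts gives how many of them lie in each of the two end
    regions; it is only used when there is at least one bridge 2-path.
    """
    if num_bridge_2paths == 0:
        # Condition 1: at most four primary bridge 3-paths.
        return num_primary_3paths <= 4
    # Condition 2 (a): the graph built from the bridge 2-paths is a ladder.
    if not g2_is_ladder:
        return False
    # Condition 2 (b): every primary bridge 3-path is in an end region.
    if sum(end_region_counts) != num_primary_3paths:
        return False
    # Condition 2 (c): at most two primary bridge 3-paths per end region.
    return all(count <= 2 for count in end_region_counts)
```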

In the graph shown in Fig 7 (b), the condition 2 (b) fails and hence the
corresponding graph G is not feasible. The proof of the following theorem is
omitted.

Theorem 6. We can test whether a biconnected plane graph G is feasible in
linear time.
Abstract. We present a labeling scheme for rooted trees which allows
one to compute, from the label of v alone, unique identifiers for the ancestors
of v that are at distance at most d from v. For any constant d our labeling
scheme produces labels of length $\log n + O(\sqrt{\log n})$, and for $d \in O(\sqrt{\log n})$
the labels are still of length $O(\log n)$.

In particular, given the labels of two nodes u and v we can determine
from the labels alone whether u is the parent of v or vice versa, whether
u and v are siblings, and whether u and v are at distance at most d from
each other.

The need for such a labeling scheme arises in several application areas,
including in particular communication networks and search engines for
large collections of Web XML files. In the latter application XML files are
viewed as trees, and typical queries ask for XML files containing a particular
set of nodes with specific ancestor, parent, or sibling relationships
among them.



1 Introduction 

We present an algorithm that, given a tree T, assigns a unique identifier, id(v),
and a unique label, lab(v), to every $v \in T$. These identifiers and labels have
the property that given the label of a node v we can compute from this label
alone the identifiers of all ancestors of v which are at distance at most d from
v. The maximum length of a label or identifier that our algorithm produces is
$\log n + O(d\sqrt{\log n})$ bits, where n is the number of nodes in the tree. Thus for
every constant d our labels are of length $\log n + O(\sqrt{\log n})$. This is close, up to
a second order additive term, to the minimum of $\log n$ bits required to identify
the nodes. Given the label of v we can construct the labels of all d ancestors of
v, starting from v, in O(1) time per ancestor.

In particular, our labeling scheme (with d = 1) allows parent and sibling
queries. That is, given the labels of two nodes u and v we can determine in constant
time whether one is the parent of the other, and whether they are siblings (have
a common parent). Furthermore, given the labels of two nodes u and v we can
determine whether the distance between u and v is at most d. In case the distance
is indeed at most d we can calculate it exactly from the two labels. This latter
result is in contrast with a recent lower bound of $\Omega(\log^2 n)$ on the maximum
label length if the labeling scheme allows one to compute the distance between any
pair of nodes |^. Our result in this paper shows that if we limit ourselves to
small distances then much shorter labels suffice.

Our labeling scheme is such that from the label of v alone we can compute
the identifiers of the d closest ancestors of v. Since according to our scheme the
label of a node v contains its identifier, we can use it to determine from the
labels of u and v whether a node u is an ancestor of v at distance at most d
from v.^ In contrast, one may be interested in a labeling scheme that allows
one to determine whether u is an ancestor of v, using the labels of both v and u,
for any pair of nodes u and v without any restriction on their distance. We
call a labeling scheme that allows such queries a labeling scheme for ancestor
queries. Notice that a labeling scheme for ancestor queries does not have any
immediate implications for distance queries. The labeling scheme we describe in
this paper does not support arbitrary ancestor queries. However, we can combine
the technique presented in this paper with a recent labeling scheme for ancestor
queries suggested by Abiteboul et al |2 and subsequently improved by Alstrup
and Rauhe 0. The resulting labeling scheme will then allow one to identify the
d closest ancestors of any node, and, additionally, to answer ancestor queries
between every pair of nodes. Although the combined scheme is more complicated
than the one we present here, the maximum label length is still $\log n + O(d\sqrt{\log n})$
(the constant factors are somewhat worse).

Our result builds upon the tree decomposition technique developed by Abiteboul
et al in ^J. We decompose the tree by applying the tree decomposition
algorithm recursively $\sqrt{\log n}$ times. Based on the resulting decomposition of the
tree we assign unique identifiers and labels to the nodes. The identifier of a node
v is a concatenation of identifiers of the representatives of v from the $\sqrt{\log n}$
recursive levels. The main technical contributions of the paper are the following.

1) We separate the notions of an identifier of a node and a label of a node, and

2) We develop a technique that exploits the recursive structure of the identifiers
and labels to compute the identifiers of the ancestors of v from the label of v.

To simplify the presentation, we demonstrate our technique in this extended
abstract using particularly simple identifiers that allow us to identify the ancestors
of a node v that are at distance at most d from v. But as already mentioned, we
can combine our technique with more complicated identifiers derived from the
work of Abiteboul et al Pj and Alstrup and Rauhe 0, to obtain a scheme that
also supports ancestor queries.



1.1 Related Work 

Early work on labeling schemes for graphs focused on adjacency labeling schemes 
where we want to determine whether u and v are adjacent based solely on their 
labels. Kannan, Naor, and Rudich m suggested O(logn) adjacency labeling 
schemes for a number of graph families, including trees, and various intersection 
graph families. 

^ We compute from the label of v the identifiers of its d closest ancestors and check 
whether one of them is the identifier of u. 
For trees an adjacency labeling scheme is a labeling scheme that allows parent
queries. Kannan et al HD! suggested the following simple scheme which we call
the pair scheme. According to the pair scheme we number the nodes of T in
some arbitrary order, by consecutive integers starting from 1. Then we label
each node v by a pair $(n(v), n(p(v)))$, where $n(v)$ and $n(p(v))$ are the
numbers of v and its parent respectively. Clearly node u is the parent of node v iff
the first component of u's label equals the second component of v's label. It is easy
to see that the length of the labels generated by this scheme is at most $2\log n$.
The labeling scheme we suggest in this paper (with d = 1) improves substantially
on the $2\log n$ bound of Kannan et al, and produces labels of length
$\log n + O(\sqrt{\log n})$ bits. One can view our technique as an extension of the simple
pair scheme. At a high level our algorithm partitions the tree into small subtrees
and uses the pair scheme within each such subtree, with special care of how to glue
these independent pair schemes together.
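
To make the pair scheme concrete, here is a minimal sketch. The input format (a dictionary mapping each node to its parent, with `None` for the root) and the use of 0 as the parent number of the root are assumptions made for this illustration, as are the function names.

```python
from typing import Dict, Hashable, Optional, Tuple

PairLabel = Tuple[int, int]  # (number of the node, number of its parent)


def pair_labels(parent: Dict[Hashable, Optional[Hashable]]) -> Dict[Hashable, PairLabel]:
    """Label every node by (n(v), n(p(v))), numbering nodes from 1.

    The root has no parent; 0 is used as a placeholder parent number.
    """
    number = {v: i for i, v in enumerate(parent, start=1)}
    return {
        v: (number[v], number[p] if p is not None else 0)
        for v, p in parent.items()
    }


def is_parent(lab_u: PairLabel, lab_v: PairLabel) -> bool:
    """True iff the node labeled lab_u is the parent of the node labeled lab_v."""
    return lab_u[0] == lab_v[1]


def are_siblings(lab_u: PairLabel, lab_v: PairLabel) -> bool:
    """True iff the two labels belong to distinct nodes with a common parent."""
    return lab_u != lab_v and lab_u[1] == lab_v[1] and lab_u[1] != 0
```

Each label stores two node numbers, which is where the $2\log n$ bound comes from.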

The notion of adjacency labeling scheme has been extended by Peleg |0| who
studied various distance labeling schemes. A distance labeling scheme allows one to
determine, or at least approximate, the distance between u and v from their labels.
Peleg PI gave a distance labeling scheme for trees producing labels of length
$O(\log^2 n)$, and an approximate distance labeling scheme for general graphs. The
approximate scheme for general graphs produces distances that are accurate up
to a factor of $\sqrt{8k}$ with labels of length $O(\log^2 n \cdot k n^{1/k})$. Further extensive work
on distance labeling schemes has been done by Gavoille et al p| and Peleg et
al |H|. These papers establish lower and upper bounds on the label length for
several graph families. In particular Gavoille et al pj show a lower bound of
$\Omega(\log^2 n)$ on the label length of any distance labeling scheme for trees. Our
result in this paper shows that if we limit ourselves to small distances then labels
much shorter than $O(\log^2 n)$ suffice.

Recent work has also focused on labeling schemes for ancestor queries.
A simple such labeling first suggested by Santoro and Khatib El produces labels
of length at most $2\log n$. Abiteboul et al P] have recently presented a labeling
scheme for ancestor queries with labels of length $\frac{3}{2}\log n + O(\log\log n)$. Alstrup
and Rauhe P| subsequently showed how to improve the bound of Abiteboul et al
to $\log n + O(\sqrt{\log n})$, by incorporating alphabetic trees |Z] into their algorithm.
A similar scheme has also been discovered independently by Zwick and Thorup
[12] in the context of a routing application. As already mentioned, our result
builds upon the tree decomposition technique of Abiteboul et al as presented in
0. We can obtain a labeling scheme for parent queries by taking the ancestor
labeling scheme of Alstrup and Rauhe |2j and adding to the label of v the height
of v in the appropriate subtree that contains it. The label length would still be
$\log n + O(\sqrt{\log n})$. This approach, however, does not seem to extend to allow
sibling queries or distance queries, even for small distances.

1.2 Motivation 

Applications for most of the labeling schemes that we mentioned in Section 1.1,
as well as the new schemes suggested in this paper, arise from two main areas.
The first is routing in communication networks and the second is the design of
search engines for large collections of XML files.

Communication Networks. Distance labeling schemes are useful in situations
when a router knows the label^ of the destination but does not know, or does
not have the time to access, the topology of the entire network. Such a router may
want to choose between several routing options (or routing algorithms) based
on the distance to the destination. In case the labels are carefully designed to
allow distance or limited distance queries, the router can figure out the distance
to the destination without accessing the topology of the network. A concrete
application of distance labeling schemes for connection establishment in ATM
networks is described in lOj.

A distance labeling scheme can also be useful to limit flooding during broadcast
or multicast of information. In such a context a router may want to switch to
unicast or TTL (Time To Live) based broadcast when all destinations reachable
from it are relatively close. A close relation between ancestor labeling schemes
and routing in trees is suggested by the work of Zwick and Thorup [12].

Note that when the label of the destination (or labels of the destinations) is 
transmitted with the packet, long labels will increase the bandwidth. Each node 
also stores the labels of all destinations it may need to send data to, so if the 
labels are long these tables may become too large. Consequently, one would like 
to use as compact labels as possible. 

XML search engines. The Web is constantly growing and contains a huge
amount of useful information. To retrieve such data, people typically use search
engines which provide full-text indexing services (the user gives a few words and
the engine returns documents containing those words). The newly emerging XML
Web-standard m allows for the development of advanced search engines that
support more sophisticated queries. XML allows one to describe the semantic nature
of the document components, enabling users not only to ask full-text queries but
also to utilize the document structure to ask for more specific data ISI.

The key observation is that Web documents obeying the XML standard can 
be viewed as trees (basically the parse tree of the document). Typical queries over 
such documents then amount to testing various inclusion and direct inclusion 
of document items, which correspond to ancestor, parent, sibling, and similar 
relationships among the analogous tree nodes mi. 

XML Web-search engines process such queries using an index structure that
summarizes this structural information nn|. Each node in the document tree
is represented in the index by some logical id (label), with the labels being
assigned to the nodes such that, given the labels of two nodes, the index can
determine fast (by looking only at the labels and without any access to the
actual file) whether one node is the parent or the ancestor of the other. To
provide good performance it is essential that the index structure (or at least a
large part of it) resides in main memory. Observe that we are talking here about
huge numbers: thousands of gigabytes of labels 0. Since a main factor which
determines the index size is the length of the labels, reducing this length, even
by a constant factor, is a critical issue, contributing to both memory cost reduction
and performance improvement.

^ Labels can also be used to identify the nodes in the network.

We use the RAM model of computation, and assume that the size of a computer
word is $\Omega(\log n)$ (hence the basic operations on the labels can be performed
in constant time). In particular we assume that the RAM model includes the
ability to perform bit manipulations like shift, bitwise AND, OR, XOR, and
base-two discrete logarithm in constant time.
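
As an illustration of the kind of word operations assumed here, the sketch below packs several bit fields into one integer with shifts and ORs, extracts them again with shifts and ANDs, and computes a base-two discrete logarithm. The field layout and the function names are hypothetical; Python integers are unbounded, so they only model a machine word.

```python
from typing import List, Sequence


def pack_fields(fields: Sequence[int], widths: Sequence[int]) -> int:
    """Concatenate non-negative fields into one word, first field most significant."""
    word = 0
    for value, width in zip(fields, widths):
        if value >> width:
            raise ValueError("field does not fit in its width")
        word = (word << width) | value
    return word


def unpack_fields(word: int, widths: Sequence[int]) -> List[int]:
    """Inverse of pack_fields: split a word back into its fields."""
    fields = []
    for width in reversed(widths):
        fields.append(word & ((1 << width) - 1))  # mask the lowest `width` bits
        word >>= width
    return fields[::-1]


def floor_log2(x: int) -> int:
    """Base-two discrete logarithm of a positive integer."""
    return x.bit_length() - 1
```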

The paper is organized as follows. In Section 2 we recall the main principle
of the tree decomposition algorithm of 0. Section 3 demonstrates our techniques
by giving a very simple scheme for parent queries using just one iteration
of the tree decomposition algorithm. Section 4 shows how to reduce the length
of the labels by adapting our technique to recursive pruning of the tree. Finally
we show in Section 5 how to extend our ideas so we can compute identifiers for
the d closest ancestors. We conclude and suggest directions for further research in
Section 6. Most proofs are omitted from this extended abstract.

2 The Prune&Contract Algorithm 

In this section we give an overview of the tree decomposition algorithm of 0
and its properties. This will serve later as the basis of our new labeling scheme.
We refer to this algorithm as the prune&contract algorithm. We assume that the
input tree is such that each internal node has at least two children.^ We denote
by n the number of nodes in a tree T and by l the number of leaves. Since we
assume that each internal node in T has at least two outgoing edges, $n < 2l$.

The algorithm uses a parameter x that controls the size of the pruned subtrees,
and has three phases. In our description below we omit some details that
can be found in 0.

- The first phase prunes from T all subtrees having $l^x$ leaves or fewer, by cutting
the edges leading to these subtrees.^ For example, consider the tree in Figure
1(a). The tree has 27 leaves. When $x = 1/3$, all subtrees containing three or fewer
leaves are pruned. The dashed lines in Figure 1 indicate the pruned subtrees.

- The second phase groups the pruned subtrees into forests each having between
$l^x$ and $2l^x$ leaves.^ We do the grouping such that the subtrees in each
forest are either all children of a single node (as in the case of the forests
B-E in Figure 1(b)), or they are the set of children of some path of nodes
(as in the case of forest A; the nodes of the path here are denoted by $a_1$-$a_3$).
The paths are such that their tail node is not a leaf of the remaining tree
(in the above example $a_3$ is not a leaf: it has a non-pruned child).

^ If this is not the case to begin with, we add a child to each internal node having
only one child. This transformation at most doubles the number of nodes. (Since the
label lengths are in terms of $\log n$, this adds at most one bit to the label.)

^ For some technical reasons we in fact also cut some subtrees having more than $l^x$
but still at most $2l^x$ leaves; for details see 0.

^ Some limited number of forests may in fact contain a smaller number of leaves.
- The third phase adds to the remaining part of the tree a representative-leaf
for each pruned forest. When the parents of the trees in the forest form
a path we first contract the path into a single node and attach the new
representative-leaf to that node. Otherwise we attach the leaf to the parent
of the forest. An exception is when there is only one single forest below
such a parent node. In this case, rather than adding a new leaf, the parent
itself serves as the representative of the forest. To continue with the above
example, in Figure 1(c) the leaves a-e represent the forests A-E resp., and
the node $a^*$ corresponds to the contracted path $a_1$-$a_3$. The nodes a, c-e are
new leaves, while b belonged to the original tree.
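
The first phase can be sketched as follows. This is a simplified illustration, not the full algorithm: it only finds the roots of the maximal subtrees with at most a given number of leaves, and it ignores the grouping, the contraction, and the additional technical cuts mentioned in the footnote. The tree representation (a dictionary from a node to its list of children) and the function names are assumptions. The threshold is passed as an integer, which plays the role of the leaf bound determined by the parameter x.

```python
from typing import Dict, Hashable, List


def leaf_counts(children: Dict[Hashable, List[Hashable]], root: Hashable) -> Dict[Hashable, int]:
    """Number of leaves in the subtree of every node, computed iteratively."""
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)  # a node is recorded before all of its descendants
        stack.extend(children.get(v, []))
    counts = {}
    for v in reversed(order):  # descendants are handled before their ancestors
        kids = children.get(v, [])
        counts[v] = 1 if not kids else sum(counts[c] for c in kids)
    return counts


def prune_roots(children: Dict[Hashable, List[Hashable]], root: Hashable, threshold: int) -> List[Hashable]:
    """Roots of the maximal subtrees that have at most `threshold` leaves."""
    counts = leaf_counts(children, root)
    pruned = []
    stack = [root]
    while stack:
        v = stack.pop()
        for c in children.get(v, []):
            if counts[c] <= threshold:
                pruned.append(c)  # cut the edge (v, c)
            else:
                stack.append(c)   # too large: look further down
    return pruned
```

In the example of Figure 1, the tree has 27 leaves and the threshold would be 3.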




Fig. 1. The pruning algorithm 



Let $T'$ denote the tree resulting from the above algorithm after pruning,
contraction, and adding representatives. The following lemma, proved in P], bounds
the number of leaves in $T'$ and the length of the contracted paths. We shall use
this lemma to establish the bound on the label length.

Lemma 1. m (1) The length of each path contracted while constructing T is at 
most 2E . (2) The number of leaves in T is at most where I is the number 

of leaves in T. 

We shall use the following definition of the representative of a node to define 
our labeling schemes. 

Definition 1. Let v be some node in T. The representative node of v in $T'$,
denoted by r(v), is defined as follows. If v was contracted as part of a path
into some node c in $T'$, or belongs to some pruned forest under such a path, then
r(v) = c. If v was pruned from T (but not under a contracted path) then it is
represented in $T'$ by the representative-leaf corresponding to the forest to which
v belongs. Otherwise, i.e. if $v \in T \cap T'$, then r(v) = v.

Finally, for each node $w \in T'$ that represents pruned vertices we denote by F(w)
the forest of nodes of T that w represents.

We number each contracted path sequentially starting from 0 such that nodes 
closer to the root have larger numbers. For a vertex v that belongs to a contracted 
path we define the index of v as its sequential number on its contracted path. 
3 A Simple Two Level Parent Labeling Scheme

First we use the prune&contract algorithm with a parameter x that will be
determined later to decompose the tree into pruned subtrees of size about $n^x$
and a remaining tree $T'$ of size about $n^{1-x}$. We number the nodes of $T'$ and
the nodes of each pruned forest independently. Using these numbers we define a
unique identifier id(v) for each $v \in T$. The identifier of v consists of two fields.
The first field is the number of r(v) in $T'$. If v is in a pruned subtree then the
second field contains the number of v in its pruned forest. Otherwise, the second
field contains the index of v on its contracted path (or zero in case v is not on a
contracted path). The second field also contains a bit which is set iff it contains
a number of a pruned node rather than an index or zero.

The label of v, lab(v), consists of id(v) together with additional information
that allows us to compute the identifier of the parent of v from the
identifier of v. The definition of this additional information follows from the
following lemma.

Lemma 2. During the prune&contract algorithm exactly one of the following
cases holds.

1. The vertices v and p(v) are on the same contracted path. In this case v and
p(v) share the same representative in $T'$ and the index of p(v) is one greater
than the index of v.

2. Vertices v and p(v) are in the same pruned subtree. In this case v and p(v)
share the same representative in $T'$ and have different numbers in the
numbering of their pruned subtree.

3. Vertex v is in a pruned subtree and vertex p(v) is on a contracted path. In
this case v and p(v) share the same representative in $T'$.

4. Vertex v is in a pruned subtree and is represented by p(v).

5. Vertex v is either (1) the last node on a contracted path, (2) a root of a
pruned subtree that is represented by a new leaf added to $T'$, or (3) belongs
to $T'$. Vertex p(v) either belongs to $T'$ or is the first vertex on a contracted
path. In this case r(p(v)) is the parent of r(v) in $T'$.

The structure of lab(v) is as follows. In addition to id(v), lab(v) contains a
merge bit which is set iff v and p(v) have a common representative in $T'$ (cases
(1)-(4) above). In case v and p(v) do not have a common representative in $T'$
(case (5)) we also keep the number of r(p(v)) in $T'$. Otherwise, if v and p(v) do have
a common representative we also keep two bits indicating which of cases (1),
(2), (3), or (4) occurred. When case (2) occurred we also keep the number of
p(v) in F(r(p(v))), and when case (3) occurred we keep the index of p(v) on its
contracted path.

It is rather straightforward to check now that given lab(v) we can compute
id(p(v)): If the merge bit is not set we obtain id(p(v)) by storing the number of
r(p(v)) in $T'$ in the first field of id(p(v)) and writing zero into the second field.
If the merge bit is set then the way we obtain id(p(v)) depends on which of the
cases (1), (2), (3), or (4) happens. In case (1) we obtain id(p(v)) by incrementing
the index in the second field of id(v). In case (2) we obtain id(p(v)) by replacing
the content of the second field in id(v) with the number of p(v) in its pruned
forest. In case (3) we replace the second field in id(v) with the index of p(v) on
its contracted path and turn off the prune bit. In case (4) we simply write zero
in the second field.
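
The case analysis above translates almost directly into code. The following sketch is an illustration under assumed representations: an identifier is a tuple (number of the representative, prune bit, number or index), and a label is a small record holding the identifier, the merge bit, the case of Lemma 2, and one extra number. The class, field, and function names are hypothetical, and a real label would be a packed bit string rather than a Python object.

```python
from dataclasses import dataclass
from typing import Tuple

# (number of r(v) in the remaining tree, prune bit, number in forest or path index)
Ident = Tuple[int, bool, int]


@dataclass(frozen=True)
class Label:
    ident: Ident
    merge: bool     # set iff v and p(v) share a representative (cases 1-4)
    case: int       # which case of Lemma 2 applies, 1..5
    extra: int = 0  # case 2: number of p(v) in its forest; case 3: index of
                    # p(v) on its path; case 5: number of r(p(v))


def parent_id(lab: Label) -> Ident:
    """Compute id(p(v)) from lab(v) alone."""
    rep, _pruned, second = lab.ident
    if not lab.merge:          # case 5: new representative, second field zero
        return (lab.extra, False, 0)
    if lab.case == 1:          # same contracted path: increment the index
        return (rep, False, second + 1)
    if lab.case == 2:          # same pruned subtree: swap in the parent's number
        return (rep, True, lab.extra)
    if lab.case == 3:          # parent on the contracted path: clear the prune bit
        return (rep, False, lab.extra)
    return (rep, False, 0)     # case 4: the parent is the representative itself


def is_parent(lab_u: Label, lab_v: Label) -> bool:
    """True iff the node labeled lab_u is the parent of the node labeled lab_v."""
    return lab_u.ident == parent_id(lab_v)
```

A parent query then compares the identifier stored in one label with the parent identifier derived from the other.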

The length of the labels. By inspecting the definition of the labels it is easy
to see that the largest labels are in cases (5), (3), and (2). One can bound the
length of the label in each of these cases using Lemma 1. To balance out the
lengths of these three kinds of labels we pick the pruning parameter x to be 1/2.
The maximum length of a label is then $\frac{3}{2}\log n + O(1)$.
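
The choice of 1/2 can be checked with a rough count. The following is only a sketch, under the assumption (suggested by the sizes quoted at the start of this section) that a number in the remaining tree takes about $(1-x)\log n$ bits and that a number in a pruned forest or an index on a contracted path takes about $x\log n$ bits; flag bits and rounding are absorbed in the additive constant.

```latex
\begin{align*}
|\mathrm{id}(v)| &\approx (1-x)\log n + x\log n = \log n,\\
\text{case (5):}\quad |\mathrm{lab}(v)| &\approx \log n + (1-x)\log n = (2-x)\log n,\\
\text{cases (2), (3):}\quad |\mathrm{lab}(v)| &\approx \log n + x\log n = (1+x)\log n,\\
2 - x = 1 + x \;&\Longrightarrow\; x = \tfrac{1}{2},
\qquad |\mathrm{lab}(v)| \approx \tfrac{3}{2}\log n + O(1).
\end{align*}
```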

In summary, we proved in this section the following theorem.

Theorem 1. The labeling and identification scheme presented in this section
produces labels of length $\frac{3}{2}\log n + O(1)$ bits such that from the label of a node v
we can obtain the identifier of p(v) in constant time.

The following is an immediate corollary. 

Corollary 1. Using the labeling scheme described in this section one can deter- 
mine from the labels of u and v alone whether one is the parent of the other, and 
whether u and v have a common parent. 

4 Smaller Labels via Recursion 

In this section we show how to obtain shorter labels for parent queries by
applying the prune&contract algorithm recursively. As in Section 3 we define a
unique identifier id(v) for each node v. The label of v consists of the identifier
of v together with additional information that allows us to compute the
identifier of p(v) from the identifier of v. By applying the prune&contract algorithm
recursively k times, we obtain k levels of pruned forests and a final remaining
tree $T_k$. To identify a node v we need a number from each pruning level. However,
taking advantage of the locality of the parent relation, we will see that
to produce the identifier of the parent of v it suffices to store one additional
piece of information from one particular pruning level. We first recall the
recursive prune&contract process, which is similar to the one of 0. Then we explain
how to use it in order to construct the identifiers and the labels. By setting
$k = \sqrt{\log n}$ and fixing appropriate pruning thresholds we obtain labels of length
at most $\log n + O(\sqrt{\log n})$.

Recursive pruning. A prune&contract process with k recursive levels works as
follows. Let $x_1, \ldots, x_k$ be a sequence of numbers that we shall fix later. First we
process T using the prune&contract algorithm with $x_1$ as the pruning threshold.
Let $T_1 = T'$ be the resulting tree. Then we apply the prune&contract algorithm
again to $T_1$ with a threshold $x_2$ and denote the resulting tree $T_1'$ by $T_2$. We
continue recursively and at each step define $T_i$, $1 \le i \le k$, to be the tree $T_{i-1}'$
obtained by applying the prune&contract algorithm to $T_{i-1}$ with threshold $x_i$.
(With $T_0$ being the original tree T.) We extend the definition of a representative,
r(v), of a node $v \in T$ from Section 2 such that $r^0(v) = v$ and $r^i(v) = r(r^{i-1}(v))$
for any $1 \le i \le k$.

As before we number the nodes of $T_k$ consecutively starting from one. Then
we do the same for the pruned forests: For each leaf or contracted node w ∈ $T_i$
such that F(w) is not empty we number the nodes in F(w) consecutively starting
from one. Based on these numberings we define the identifiers and the labels of
the nodes of T.

We start by defining the identifier of a node v, id(v). The identifier of v
consists of k + 1 fields ordered from field k down to field 0, and denoted by
$f_k, \ldots, f_0$. Field $f_i$ corresponds to the i-th iteration of the prune&contract
algorithm. Field $f_i$ starts with a prune bit that is set iff $r^i(v)$ was pruned. If $r^i(v)$
was pruned then this bit is followed by the number of $r^i(v)$ in its pruned forest.
Otherwise, the prune bit is followed by the index of $r^i(v)$ on its contracted path
or by zero if $r^i(v)$ was not contracted. It is easy to prove by induction on the
number of iterations of the prune&contract algorithm that these identifiers are
indeed unique.

The label of v consists of the identifier of v together with additional
information that is needed to produce the identifier of the parent of v from the identifier
of v. To define that additional information we need the following definition of
the merge-level of a vertex v, and the subsequent lemmas.

Definition 2. We define the merge-level of a vertex v ∈ T as the maximum j
such that $r^j(v) \ne r^j(p(v))$. We denote the merge-level of a vertex v by m(v).

Let m = m(v). The following lemma characterizes the possible relations
between $r^m(v)$ and $r^m(p(v))$.

Lemma 3. If m = m(v) < k then exactly one of the following occurs:

1. Vertices $r^m(v)$ and $r^m(p(v))$ are in the same pruned subtree of $T_m$.

2. Vertices $r^m(v)$ and $r^m(p(v))$ are on the same contracted path of $T_m$.

3. Vertex $r^m(v)$ is in a pruned subtree of $T_m$ and vertex $r^m(p(v))$ is on a
contracted path of $T_m$.

4. Vertex $r^m(v)$ is the root of a pruned subtree of $T_m$ and $r^m(p(v)) =
r^{m+1}(p(v))$ is a leaf of $T_{m+1}$ representing v.

If m(v) = k then $r^k(p(v))$ is the parent of $r^k(v)$.

The following lemma characterizes what can happen to $r^i(v)$ and $r^i(p(v))$
when i < m(v). Its proof follows from the definition of m(v) and the definition
of the prune&contract algorithm.

Lemma 4. (i) For every i < m(v) either $r^i(p(v)) = r^{i+1}(p(v))$ or $r^i(p(v))$ is
the first node on a contracted path to which $r^{i+1}(p(v))$ corresponds.

(ii) For every i < m(v) either (1) $r^i(v) = r^{i+1}(v)$, (2) $r^i(v)$ is the last vertex on
a contracted path to which $r^{i+1}(v)$ corresponds, or (3) $r^i(v)$ is a leaf added
to $T_i$ to represent a pruned forest and $r^{i-1}(v)$ is a root of a tree in that forest.
Now we are ready to define the additional information, denoted by $\mathrm{add}_1(v)$, that
we add to id(v) in order to produce the identifier of p(v) from the identifier of
v. The label of v consists of both id(v) and $\mathrm{add}_1(v)$. The first field of $\mathrm{add}_1(v)$
contains the merge-level of v. We denote this merge-level by m. If m(v) = k
then the second field contains the number of $r^k(p(v))$ in $T_k$. Otherwise the second
field contains two bits indicating which of the cases in Lemma 3 occurred. If Case
(1) occurred then $\mathrm{add}_1(v)$ also contains the number of $r^m(p(v))$ in its pruned
forest. If Case (3) occurred then $\mathrm{add}_1(v)$ also contains the index of $r^m(p(v))$ on
its contracted path.

Given id(v) and $\mathrm{add}_1(v)$ we compute id(p(v)) as follows. If the merge-level
of v is k then field $f_k$ of id(p(v)) contains the number of $r^k(p(v))$ in $T_k$ which is
stored in $\mathrm{add}_1(v)$, and all fields $f_i$ for i < k of id(p(v)) are zero. So let m = m(v)
and assume m < k. By the definition of the identifiers, and the definition of the
merge-level, fields $f_i$ for i > m of id(p(v)) are the same as the corresponding
fields of id(v). By the first part of Lemma 4, fields $f_i$ of id(p(v)) for i < m are
zero. To compute field $f_m$ of id(p(v)) we check $\mathrm{add}_1(v)$ to see which of the cases
of Lemma 3 occurred at iteration m of the prune&contract algorithm. If Case (1)
occurred then field $f_m$ of id(p(v)) contains the number of $r^m(p(v))$ in its pruned
forest. We copy this number from $\mathrm{add}_1(v)$ where it is stored, and also turn on
the prune bit of this field. If Case (2) occurred we obtain field $f_m$ of id(p(v)) by
adding one to the index stored in field $f_m$ of id(v). If Case (3) occurred, field $f_m$
of id(p(v)) contains the index of $r^m(p(v))$ on its contracted path that is stored
in $\mathrm{add}_1(v)$, with the prune bit turned off. If Case (4) occurred then field $f_m$ of
id(p(v)), including the prune bit, is zero.
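
The case analysis above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each field is assumed to be stored as a pair (prune_bit, value), `id_v[i]` is assumed to hold field f_i, and the additional information is passed as a hypothetical triple `(m, case, value)`; the function name is ours.

```python
def parent_identifier(id_v, add1):
    """Compute the fields of id(p(v)) from id(v) and the extra label data.

    id_v : list of (prune_bit, value) pairs, id_v[i] is field f_i, i = 0..k
    add1 : (m, case, value) -- merge-level, case number of the lemma
           (ignored when m == k), and the stored number or index (or None)
    """
    k = len(id_v) - 1
    m, case, value = add1
    zero = (0, 0)

    if m == k:
        # Top field holds the number of the parent's representative in T_k;
        # all lower fields are zero.
        return [zero] * k + [(0, value)]

    # Fields below m are zero, fields above m are copied from id(v).
    parent = [zero] * m + [zero] + list(id_v[m + 1:])
    if case == 1:          # same pruned subtree: stored number, prune bit on
        parent[m] = (1, value)
    elif case == 2:        # same contracted path: next index on the path
        parent[m] = (0, id_v[m][1] + 1)
    elif case == 3:        # parent on a contracted path: stored index
        parent[m] = (0, value)
    else:                  # case 4: field is zero, prune bit included
        parent[m] = zero
    return parent
```

Every branch touches a constant number of fields besides copying, which matches the claim that only one extra piece of information from a single level is needed.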

The length of the labels. Each field in the label is of fixed length that is
determined by the maximum possible value that can be assigned to that field.
Since $\mathrm{add}_1(v)$ contains information from a single but arbitrary recursive level it
follows that the total label length is minimized if we choose the pruning
thresholds $x_1, \ldots, x_k$ to be the same. The label length then is $(1 + \frac{1}{k}) \log n + O(k)$.
Therefore by setting $k = \sqrt{\log n}$ we get labels of length $\log n + O(\sqrt{\log n})$. Since
we assumed that a computer word can hold log n bits, it is straightforward
to check that we can compute id(p(v)) from id(v) in constant time.

The following theorem and its corollary summarize the results of this section. 

Theorem 2. The identification and labeling scheme we presented in this section
assigns identifiers and labels of length $\log n + O(\sqrt{\log n})$ to the nodes of a tree
such that given the label of a node v we can compute the identifier of p(v) in
constant time.



Corollary 2. One can label the nodes of a tree T with labels of length
$\log n + O(\sqrt{\log n})$ such that given the labels of u and v we can determine from the labels
alone whether one is the parent of the other, and whether u and v are siblings.



256 H. Kaplan and T. Milo 



5 Obtaining the Identifiers of the d Closest Ancestors 

We can extend the algorithm from the previous section to produce labels from
which we can compute the identifiers of all ancestors of v which are at distance at
most d from v. We use the recursive prune&contract process and the identifiers
as defined in Section 4. The label of v contains id(v) and an extended additional
field, $\mathrm{add}_d(v)$, that allows us to compute the identifiers of d ancestors of v. The
extended label of a vertex, $\mathrm{lab}_d(v)$, consists of id(v) and $\mathrm{add}_d(v)$.

The field $\mathrm{add}_d(v)$ is a concatenation of the fields $\mathrm{add}_1(p^i(v))$, for
$0 \le i \le d - 1$. Given id(v) and $\mathrm{add}_1(v)$ we use the algorithm of Section 4 to compute
id(p(v)). Then we take id(p(v)) and $\mathrm{add}_1(p(v))$ and compute $\mathrm{id}(p^2(v))$ using the
same algorithm. We continue in a similar way d times until we have computed
$\mathrm{id}(p^d(v))$. Since each of the fields $\mathrm{add}_1(p^i(v))$ is of length $O(\sqrt{\log n})$, the length of
$\mathrm{add}_d(v)$ is $O(d\sqrt{\log n})$, and our labels for computing the identifiers of d ancestors
are of length $\log n + O(d\sqrt{\log n})$. We conclude with the main result of this section.

Theorem 3. There are sets of unique identifiers id(v) and unique labels lab(v)
that one can assign to the nodes of the tree T such that from lab(v) we can
compute $\mathrm{id}(p^i(v))$ for $1 \le i \le d$. The maximum length of an identifier or a label
is $\log n + O(d\sqrt{\log n})$ bits.

Our labeling scheme can be used for distance queries. Given the label of u and
the label of v we can determine whether u and v are at distance at most d from
each other. We do that by producing the identifiers of the d closest ancestors of
v and the d closest ancestors of u. If there is no intersection among these lists
of identifiers then we conclude that the distance between u and v is larger than
d. Otherwise we find the identifier, id(z), of the lowest common ancestor, z, of
u and v. By summing up the sequential number of z on the list of the ancestors
of v and the sequential number of z on the list of the ancestors of u we obtain
the distance between u and v. Notice that if u and v are at distance at most d
from each other then their lowest common ancestor is at distance at most d from
each of them and therefore it will show up in the list of the d closest ancestors
of v and in the list of the d closest ancestors of u. So we have established the
following corollary of Theorem 3.

Corollary 3. The labeling scheme of Theorem 3 allows one to determine the
distance between u and v if this distance is less than d, or to figure out that the distance
between u and v is larger than d. This labeling scheme also allows one to determine
whether u is an ancestor of v at distance j from v, for any $j \le d$.
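
The distance query described in this section reduces to intersecting two short lists of identifiers. The following Python sketch illustrates that step only; it assumes the ancestor identifiers have already been extracted from the labels, that each list starts with the node's own identifier at position 0, and that identifiers are hashable. The function name is ours.

```python
def bounded_distance(anc_u, anc_v):
    """Distance between u and v from their lists of ancestor identifiers.

    anc_x[i] is the identifier of the i-th ancestor of x (anc_x[0] is x
    itself). Returns None when the lists share no identifier, i.e. the
    lowest common ancestor is not within reach of both lists.
    """
    position_in_v = {ident: j for j, ident in enumerate(anc_v)}
    # Scanning u's list bottom-up, the first shared identifier is the
    # lowest common ancestor, because identifiers are unique.
    for i, ident in enumerate(anc_u):
        if ident in position_in_v:
            return i + position_in_v[ident]
    return None
```

A caller interested only in distances up to d would compare the returned value with d and treat `None` as "larger than d".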



6 Conclusions 

We have presented an algorithm that assigns identifiers and labels of length
$O(\log n + d\sqrt{\log n})$ to the nodes of a tree such that given the label of a node v one
can compute the identifiers of the d immediate ancestors of v. In particular our
labeling scheme allows one to determine the distance between u and v from their
labels if this distance is less than d, or to figure out that the distance is larger than d.

When d is small our labels are particularly short, but as d gets larger
our labels may become even linear in the number of nodes. In contrast, Peleg
in IP shows a labeling scheme for trees whose labels are of length $O(\log^2 n)$
which allows one to determine any distance between u and v. So one may try to
develop a labeling scheme that works the same for every d, is as efficient as
ours for small d, and degrades to no more than $O(\log^2 n)$ for larger values of d.
Another line for further research is to implement our algorithms and test their
performance on real XML data.
Abstract. We consider the problem of computing the product of two
n × n Boolean matrices A and B. For an n × n Boolean matrix C, let $G_C$
be the complete weighted graph on the rows of C where the weight of an
edge between two rows is equal to their Hamming distance, i.e., the number
of entries in the first row having values different from the corresponding
entries in the second one. Next, let MWT(C) be the weight of a minimum
weight spanning tree of $G_C$. We show that the product of A with B as
well as the so called witnesses of the product can be computed in time
$\tilde{O}(n(n + \min\{MWT(A), MWT(B^t)\}))$ $^1$.

1 Introduction 

Since Strassen published his first subcubic algorithm for arithmetic matrix
multiplication a lot of work in this area has been done. The best asymptotic
upper bound on the number of arithmetic operations necessary to multiply two
n × n matrices is presently due to Coppersmith and Winograd [5]. Since
Boolean matrix multiplication is trivially reducible to arithmetic 0-1-matrix
multiplication the same asymptotic upper bound holds in the Boolean case.
Unfortunately, the aforementioned substantially subcubic algorithms for
arithmetic matrix multiplication are based on algebraic approaches that are difficult to
implement.

In [9], Schnorr and Subramanian have shown that the Boolean product of
two n × n random Boolean matrices can be determined by a simple
combinatorial algorithm with high probability in time $\tilde{O}(n^2)$. Consequently, they raised
the question of whether or not there exists a substantially subcubic combinatorial
algorithm for Boolean matrix multiplication (the fastest known combinatorial
algorithm for this problem is due to Basch et al. [3] and runs in time $O(n^3 / \log^2 n)$).

In this paper, we give further evidence that such a combinatorial
algorithm might exist. Namely, we provide a combinatorial algorithm for Boolean
matrix multiplication which is substantially subcubic in case the rows of the
first n × n matrix or the columns of the second one are highly clustered, i.e.,
their minimum spanning tree in the Hamming metric has low cost. More exactly,
our algorithm runs in time $\tilde{O}(n(n + c))$ where c is the minimum of the costs of
the minimum spanning trees for the rows and the columns, respectively, in the
Hamming metric. It relies on the fast methods for computing an approximate
minimum spanning tree in the $L_1$ and $L_2$ metrics given in [7, 8].

$^1$ $\tilde{O}(f(n))$ means $O(f(n)\,\mathrm{poly\text{-}log}\, n)$ and $B^t$ stands for the transposed matrix B.

We also observe that the simple combinatorial algorithm from 0 for the
Boolean product of two n × n matrices runs in time O(n(n + r)) if at least one of the
matrices is r-sparse, i.e., contains at most r entries set to 1.

If an entry with indices i, j of the Boolean product of two Boolean
matrices A and B is equal to 1 then any index k such that A[i,k] and B[k,j] are
equal to 1 is a witness to this. Quite recently, Alon and Naor [2] and Galil and
Margalit [6] have shown that the witnesses for the Boolean matrix product of
two n × n Boolean matrices (i.e., for all its nonzero entries) can be computed in
time $\tilde{O}(n^{\omega})$, where $n^{\omega}$ denotes the cost of arithmetic matrix multiplication,
by repeatedly applying the aforementioned algorithm of Coppersmith
and Winograd for arithmetic matrix multiplication [5]. The combinatorial
algorithms for Boolean matrix multiplication presented in this paper yield the
witnesses directly without any extra asymptotic time-cost.

Our paper is structured as follows. The next section presents known facts
on approximating minimum spanning trees in the $L_1$ and $L_2$ metrics. Section 3
contains our algorithm for fast Boolean matrix multiplication for highly clustered
data and its analysis. Finally, Section 4 presents the observations on Boolean
matrix multiplication for sparse matrices.



2 Approximate MST in the Hamming Metric 

For c ≥ 1 and a finite set S of points in a metric space, a c-approximate
minimum spanning tree for S is a spanning tree in the complete weighted graph
on S, with edge weights equal to the distances between the endpoints, whose
total weight is at most c times the minimum.
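
To make the quantity MWT(C) concrete, here is a small Python sketch that computes the exact weight of a minimum spanning tree of the rows of a 0-1 matrix under the Hamming distance. It uses a plain quadratic Prim's algorithm, which is our substitution for illustration; it is not one of the fast approximate methods discussed in this section, and the function names are ours.

```python
def hamming(a, b):
    """Number of positions in which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b))


def mwt(C):
    """Weight of a minimum spanning tree of the rows of C (Hamming metric)."""
    n = len(C)
    if n == 0:
        return 0
    in_tree = [False] * n
    in_tree[0] = True
    # best[x] is the cheapest known connection from row x to the tree.
    best = [hamming(C[0], C[x]) for x in range(n)]
    total = 0
    for _ in range(n - 1):
        i = min((x for x in range(n) if not in_tree[x]), key=lambda x: best[x])
        in_tree[i] = True
        total += best[i]
        for x in range(n):
            if not in_tree[x]:
                best[x] = min(best[x], hamming(C[i], C[x]))
    return total
```

For a matrix whose rows are all identical the result is 0, which is the extreme case of "highly clustered" data.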

In [7] (section 4.3), Indyk and Motwani in particular considered the
bichromatic ε-approximate closest pair problem for n points in $R^d$ with integer
coordinates in O(1) under the $L_p$ metric, p ∈ {1, 2}. They showed that there is
a dynamic data structure for this problem which supports insertions, deletions
and queries in time $\tilde{O}(d n^{1/(1+\epsilon)})$ and requires $\tilde{O}(dn + n^{1+1/(1+\epsilon)})$-time
preprocessing. In consequence, by a simulation of Kruskal's algorithm they deduced
the following fact.

Fact 1. For ε > 0, a (1 + ε)-approximate minimum spanning tree for a set of n
points in $R^d$ with integer coordinates in O(1) under the $L_1$ or $L_2$ metric can be
computed by a Monte Carlo algorithm in time $\tilde{O}(d n^{1+1/(1+\epsilon)})$.

In [8] Indyk, Schmidt and Thorup reported an even slightly more efficient (by a
poly-log factor) reduction of the problem of finding a (1 + ε)-approximate minimum
spanning tree to the bichromatic ε-approximate closest pair problem via an easy
simulation of Prim's algorithm.
3 Boolean Matrix Multiplication via MST

For an i-th row of A and a j-th column of B, the set of witnesses is the set of all
indices k in {1, ..., n} such that A[i,k] = 1 and B[k,j] = 1. An n × n matrix W
such that W[i,j] is the set of witnesses of the i-th row of A and the j-th column
of B is called a witness matrix for the product of A and B.

The idea of our combinatorial algorithm for witnesses of Boolean matrix
product is simple. First, we compute an approximate spanning tree of the rows
of A (or, the columns of B, alternatively) in the Hamming metric. Then, we
fix a traversal of the tree and precompute the set differences between the sets
of entries set to one for consecutive neighboring rows in the traversal. Finally,
for each column of B, we traverse the tree and compute the set of witnesses for
the dot product of the traversed row of A with the column of B from that for
the previously traversed row of A and the column of B.

Algorithm 1

Input: n × n Boolean matrices A and B;

Output: witnesses for the Boolean product of A and B.

1. for i = 1, ..., n do
      $A1_i$ ← {k | A[i,k] = 1}

2. Compute an O(log n)-approximate minimum weight spanning tree $T_A$ of the
   graph $G_A$;

3. Fix a traversal of $T_A$ of length linear in the size of $T_A$;

4. $i_0$ ← the number of the row corresponding to the firstly traversed node of $T_A$;

5. i ← $i_0$;

6. while traversing the tree $T_A$ do
   begin
      l ← i;
      i ← the number of the row of A corresponding to the next node of $T_A$ on the
          traversal;
      $D1_{l,i}$ ← $A1_i \setminus A1_l$;
      $D1_{i,l}$ ← $A1_l \setminus A1_i$
   end

7. for j = 1, ..., n do
      i ← $i_0$;
      W[i,j] ← the set of witnesses for the i-th row of A and j-th column of B;
      while traversing the tree $T_A$ do
      begin
         l ← i;
         i ← the number of the row of A corresponding to the next node of $T_A$ on the
             traversal;
         if i has been already visited then go to E;
         W[i,j] ← W[l,j];
         for each k ∈ $D1_{l,i}$ do
            if B[k,j] = 1 then insert k into W[i,j];
         W[i,j] ← W[i,j] \ $D1_{i,l}$;
         if W[i,j] is non-empty then output (i,j,w) where w is an element in W[i,j]
      E: end
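
The following Python sketch mirrors the structure of Algorithm 1 and may help in checking its logic. It departs from the paper in ways that should be read as assumptions: the spanning tree is an exact one built by a quadratic Prim's algorithm rather than the O(log n)-approximate tree of step 2, the traversal visits each row once in the order Prim adds it (so every row is processed after its tree neighbour), and witness sets are copied as Python sets, so the sketch illustrates correctness and does not attain the stated running time. All names are ours, and n ≥ 1 is assumed.

```python
def witnesses_via_mst(A, B):
    """Return W with W[i][j] = set of witnesses k (A[i][k] = B[k][j] = 1)."""
    n = len(A)
    # Step 1: the set of positions holding a one in each row of A.
    ones = [frozenset(k for k in range(n) if A[i][k]) for i in range(n)]

    # Step 2 (assumption: exact MST by Prim under the Hamming distance).
    in_tree = [False] * n
    in_tree[0] = True
    best = [len(ones[0] ^ ones[x]) for x in range(n)]
    link = [0] * n              # tentative tree neighbour of each row
    parent = [None] * n
    order = [0]                 # Steps 3-5: rows in the order they are reached
    for _ in range(n - 1):
        i = min((x for x in range(n) if not in_tree[x]), key=lambda x: best[x])
        in_tree[i] = True
        parent[i] = link[i]
        order.append(i)
        for x in range(n):
            if not in_tree[x]:
                dist = len(ones[i] ^ ones[x])
                if dist < best[x]:
                    best[x], link[x] = dist, i

    # Step 6: set differences along tree edges.
    added = {i: ones[i] - ones[parent[i]] for i in order[1:]}
    removed = {i: ones[parent[i]] - ones[i] for i in order[1:]}

    # Step 7: for each column, update the witness set along the tree.
    W = [[None] * n for _ in range(n)]
    for j in range(n):
        W[0][j] = {k for k in ones[0] if B[k][j]}
        for i in order[1:]:
            w = set(W[parent[i]][j])
            w |= {k for k in added[i] if B[k][j]}
            w -= removed[i]
            W[i][j] = w
    return W
```

The work done per column on the update step is proportional to the sizes of the set differences, which is where the spanning tree weight enters the analysis.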

Lemma 1. Algorithm 1 is correct, i.e., it outputs witnesses for the product of
the Boolean matrices A and B.

Lemma 2. Algorithm 1 can be implemented in time $\tilde{O}(n(n + MWT(A))) + t(n)$
where t(n) is the time taken by the construction of the O(log n)-approximate
minimum weight spanning tree in step 2.

Proof. By representing the sets $A1_i$ of indices with monotone lists, we can
compute each of the set differences $D1_{l,i}$, $D1_{i,l}$ in time O(n). This combined with
the linear in n length of the traversal of $T_A$ implies an $O(n^2)$-time
implementation of steps 1 and 6. To implement step 7 efficiently we use search trees to
form the sets W[i,j]. Then, the cost of transforming W[l,j] into W[i,j]
becomes $O(|D1_{l,i} \cup D1_{i,l}| \log n)$ and consequently the overall time-cost of step 7 is
$O(n(n + \log n \sum_{(l,i)} |D1_{l,i} \cup D1_{i,l}|))$, where the sum ranges over the pairs of
consecutive rows on the traversal, which is $\tilde{O}(n(n + MWT(A)))$.

Note that the $L_1$ metric for points in $R^n$ with 0,1-coordinates coincides with
the n-dimensional Hamming metric. It follows from Fact 1 with ε set to O(log n)
that an O(log n)-approximate minimum spanning tree for a set of n vectors in the
n-dimensional Hamming metric can be computed by a Monte Carlo algorithm
in time $\tilde{O}(n^2)$. Also, the transposed product of matrices A and B is equal to
the product of the transposed matrix B with the transposed matrix A. Hence,
Lemmata 1, 2 yield our main result.

Theorem 1. Let A, B be two n × n Boolean matrices. The product of A with
B as well as witnesses for the product can be computed in expected time
$\tilde{O}(n(n + \min\{MWT(A), MWT(B^t)\}))$ where $B^t$ stands for the transposed matrix
B.

4 The Sparse Case 

It is well known that if one of the two input n × n Boolean matrices to multiply is
sparse, i.e., has the number r of entries set to 1 substantially smaller than $n^2$, then
the witness matrix can be computed almost from the definition in substantially
sub-cubic time. Simply, we can form for each row or column of the sparse matrix
a list of indices of entries holding ones, and for each column or row of the other
matrix a search tree for ones in it. Now, to determine the witnesses for the (i, j)
entry of the output matrix it is enough to search the indices in the list of ones in the
i-th row of the first matrix in the search tree for ones in the j-th column of the
other matrix or vice versa. Since each search costs O(log n), this method requires
O(n(n + r) log n) operations. Note that the results of the previous section yield
an analogous upper bound in the sparse case up to a logarithmic factor.

Interestingly, one can obtain a slightly better result in the sparse case by
considering the so called column-row and row-column matrix representations
(cf. 0).

A column-row (CR for short) representation of a Boolean n × n matrix A is
a sequence $R_k(A)$, k = 1, ..., n, such that $R_k(A)$ is a list of the numbers of the rows of
A that have 1 on the k-th position (i.e., in the k-th column) in the increasing
order. A row-column (RC for short) representation $C_k(A)$, k = 1, ..., n, of A is
defined symmetrically. The following lemma is straightforward.

Lemma 3. CR and RC representations of a Boolean n × n matrix A can be
constructed in time $O(n^2)$.

Theorem 2. Let A, B be two Boolean n × n matrices. If A or B has at most
r entries set to 1 then the sets of witnesses for the Boolean product of A and B
can be determined in time O(n(n + r)).

Proof. To compute the witness matrix W set all its entries initially to the empty
set. Construct a CR representation $R_k(A)$, k = 1, ..., n, of A and an RC
representation $C_k(B)$, k = 1, ..., n, of B. For k = 1, ..., n, all i ∈ $R_k(A)$ and j ∈ $C_k(B)$,
augment W[i,j] by k. The correctness of setting W follows directly from the
definitions of CR and RC. By Lemma 3 the number of operations performed is
$O(n^2 + \sum_{k=1}^{n} |R_k(A)||C_k(B)|)$ which is $O(n^2 + nr)$.
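
The proof translates almost directly into code. The Python sketch below is an illustration with names of our choosing: `R[k]` plays the role of the CR list for column k of A and `C[k]` the RC list for row k of B, with 0-based indices instead of the 1-based indices of the text.

```python
def witnesses_sparse(A, B):
    """Witness sets of the Boolean product of two n x n 0-1 matrices."""
    n = len(A)
    # CR representation of A: rows having a 1 in column k, in increasing order.
    R = [[i for i in range(n) if A[i][k]] for k in range(n)]
    # RC representation of B: columns having a 1 in row k, in increasing order.
    C = [[j for j in range(n) if B[k][j]] for k in range(n)]

    W = [[set() for _ in range(n)] for _ in range(n)]
    for k in range(n):
        # Every pair (i, j) with A[i][k] = B[k][j] = 1 has k as a witness.
        for i in R[k]:
            for j in C[k]:
                W[i][j].add(k)
    return W
```

The triple loop performs one insertion per (i, k, j) with A[i][k] = B[k][j] = 1, so if one matrix has at most r ones the number of insertions is at most n·r.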

5 Final Remark 

It follows from the existence of the so called Hadamard matrices [4] that there is
an infinite sequence of $n_i \times n_i$ matrices $A_i$, $B_i$ such that the Hamming distance
between any pair of rows of $A_i$ or columns of $B_i$ is $\Omega(n_i)$. Then, the cost of
the corresponding minimum spanning trees is $\Omega((n_i)^2)$. Thus, our combinatorial
algorithm for Boolean matrix multiplication presented in Section 3 does not
break the cubic upper bound in the general case. However, in many applications
of Boolean matrix multiplication where the rows or columns respectively tend
to be more clustered the aforementioned scenario would be unlikely.

References 

1. A.V. Aho, J.E. Hopcroft and J.D. Ullman. The Design and Analysis of Computer
Algorithms (Addison-Wesley, Reading, Massachusetts, 1974).

2. N. Alon and M. Naor. Derandomization, Witnesses for Boolean Matrix Multipli- 
cation and Construction of Perfect hash functions. Algorithmica 16, pp. 434-449, 
1996. 

3. J. Basch, S. Khanna and R. Motwani. On Diameter Verification and Boolean
Matrix Multiplication. Technical Report, Stanford University CS department, 1995.

4. P.J. Cameron. Combinatorics. Cambridge University Press 1994. 

5. D. Coppersmith and S. Winograd. Matrix Multiplication via Arithmetic Progres- 
sions. J. of Symbolic Computation 9 (1990), pp. 251-280. 



Fast Boolean Matrix Multiplication for Highly Clustered Data 263 



6. Z. Galil and O. Margalit. Witnesses for Boolean Matrix Multiplication and Shortest 
Paths. Journal of Complexity, pp. 417-426, 1993. 

7. P. Indyk and R. Motwani. Approximate Nearest Neighbors: Towards Removing 
the Curse of Dimensionality. Proceedings of the 30th ACM Symposium on Theory 
of Computing, 1998. 

8. P. Indyk, S.E. Schmidt, and M. Thorup. On reducing approximate mst to closest 
pair problems in high dimensions. Manuscript, 1999. 

9. C.P. Schnorr and C.R. Subramanian. Almost Optimal (on the average) Combinato- 
rial Algorithms for Boolean Matrix Product Witnesses, Computing the Diameter. 
Randomization and Approximation Techniques in Computer Science. Second In- 
ternational Workshop, RANDOM’98, Lecture Notes in Computer Science 1518, 
pp. 218-231. 




Partitioning Colored Point Sets into 
Monochromatic Parts 



Adrian Dumitrescu$^1$ and Janos Pach$^2$



$^1$ University of Wisconsin at Milwaukee,
ad@cs.uwm.edu

$^2$ Courant Institute, NYU and Hungarian Academy of Sciences,
pach@cims.nyu.edu



Abstract. It is shown that any two-colored set of n points in general
position in the plane can be partitioned into at most $\lceil (n+1)/2 \rceil$ monochromatic
subsets, whose convex hulls are pairwise disjoint. This bound cannot be
improved in general. We present an O(n log n) time algorithm for
constructing a partition into fewer parts, if the coloring is unbalanced, i.e.,
the sizes of the two color classes differ by more than one. The
analogous question for k-colored point sets (k > 2) and its higher dimensional
variant are also considered.



1 Introduction 

A set of points in the plane is said to be in general position, if no three of its
elements are collinear. Given a two-colored set S of n points in general position
in the plane, let p(S) be the minimum number of monochromatic subsets S can
be partitioned into, such that their convex hulls are pairwise disjoint. Let

$p(n) = \max\{p(S) : S \subset R^2 \text{ is in general position}, |S| = n\}.$

The following related quantities were introduced by Ron Aharoni and Michael
Saks 0. Given a set S of w white and b black points in general position in the
plane, let g(S) denote the number of edges in a largest non-crossing matching
of S, where every edge is a straight-line segment connecting two points of the
same color. Let

$g(n) = \min\{g(S) : S \subset R^2 \text{ is in general position}, |S| = n\}.$

Aharoni and Saks asked if it is always possible to match all but a constant
number of points, i.e., if $g(n) \ge n/2 - O(1)$ holds. It was shown in 0 that the
answer to this question is in the negative. However, according to our next result
(proved in Section 2), this inequality holds for p(n). In other words: one can
partition a set of n points in the plane into n/2 + O(1) monochromatic parts
whose convex hulls are disjoint; however if the size of each part is restricted to
at most two points, this is not possible (there are configurations which require
$(1/2 + \delta)n - O(1)$ parts, $\delta > 0$).

Theorem 1. Let p(n) denote the smallest integer p with the property that every
2-colored set of n points in general position in the plane can be partitioned into p
monochromatic subsets whose convex hulls are pairwise disjoint. Then we have

$p(n) = \left\lceil \frac{n+1}{2} \right\rceil.$



For unbalanced colorings, i.e., when the sizes of the two color classes differ
by more than one, we prove a stronger result. Slightly abusing notation, for any
$w \le b$, we write p(w, b) for the minimum number of monochromatic subsets,
into which a set of w white and b black points can be partitioned, so that their
convex hulls are pairwise disjoint. Obviously, we have $p(w, b) \le p(w + b)$.



Theorem 2. For any $w \le b$, we have

$w + 1 \le p(w, b) \le 2 \left\lceil \frac{w}{2} \right\rceil + 1.$    (1)



More precisely, 

(i) if $w \le b \le w + 1$ or w is even, we have p(w, b) = w + 1;

(ii) for all odd $w \ge 1$ and $b \ge 2w$, we have p(w, b) = w + 2;

(iii) for all odd $w \ge 3$ and $w + 2 \le b \le 2w - 1$, we have $w + 1 \le p(w, b) \le w + 2$.



We prove this theorem in Section 3, where we also present an O(n log n) time
algorithm which computes a partition meeting the requirements.

For k-colored point sets with $k \ge 3$, the functions $p_k(n)$ and $g_k(n)$ can be
defined similarly. In Section 4, we prove



Theorem 3. For any $k, n \ge 3$, let $p_k(n)$ denote the smallest integer p with the
property that every k-colored set of n points in general position in the plane
can be partitioned into p monochromatic subsets whose convex hulls are pairwise
disjoint. Then we have





< Pk{n) < 




k + 1/6 



n + 0(1). 



We also provide an O(n log n) time algorithm for computing such a partition.

In Section 5, we discuss the analogous problem for k = 2 colors, but in higher
dimensions. A set of $n \ge d + 1$ points in d-space is said to be in general position,
if no d + 1 of its elements lie in a hyperplane.

A sequence $a_1 a_2 \ldots$ of integers between 1 and m is called an (m, d)-
Davenport-Schinzel sequence, if (i) it has no two consecutive elements which
are the same, and (ii) it has no alternating subsequence of length d + 2, i.e.,
there are no indices $1 \le i_1 < i_2 < \ldots < i_{d+2}$ such that

$a_{i_1} = a_{i_3} = a_{i_5} = \ldots = a, \qquad a_{i_2} = a_{i_4} = a_{i_6} = \ldots = b,$

where $a \ne b$. Let $\lambda_d(m)$ denote the maximum length of an (m, d)-Davenport-
Schinzel sequence (see 0, ^). Obviously, we have $\lambda_1(m) = m$, and it is easy
to see that $\lambda_2(m) = 2m - 1$, for every m. It was shown by Hart and Sharir 0
that $\lambda_3(n) = O(n\alpha(n))$, where $\alpha(n)$ is the extremely slowly growing functional
inverse of Ackermann's function, and that this estimate is asymptotically tight.
They also proved that $\lambda_d(m)$ is only slightly superlinear in m, for every fixed
$d \ge 3$. (For the best currently known bounds of this type, see p.)
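
The definition of an (m, d)-Davenport-Schinzel sequence can be checked mechanically. The Python sketch below is an illustration with a function name of our choosing; for each ordered pair of distinct symbols it greedily extracts the longest alternating subsequence starting with the first symbol, which suffices because taking the earliest possible occurrence never shortens the alternation.

```python
def is_davenport_schinzel(seq, d):
    """True iff seq has no equal neighbours and no alternation of length d + 2."""
    # Condition (i): no two consecutive elements are the same.
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    symbols = set(seq)
    # Condition (ii): no subsequence a b a b ... of length d + 2 with a != b.
    for a in symbols:
        for b in symbols:
            if a == b:
                continue
            want, length = a, 0
            for x in seq:
                if x == want:
                    length += 1
                    want = b if want == a else a
            if length >= d + 2:
                return False
    return True
```

For instance, with d = 2 the sequence 1 2 1 is admissible while 1 2 1 2 is not, in line with the value 2m − 1 for m = 2.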

For any fixed d, let

$\mu_d(n) = \min\{m : \lambda_d(m) \ge n\}.$

Thus, we have that

$\mu_2(n) = \left\lceil \frac{n+1}{2} \right\rceil,$

and $\mu_d(n)$ is only very slightly sublinear in n, for any $d \ge 3$.



Theorem 4. For any $n > d \ge 2$, let $p^{(d)}(n)$ denote the smallest integer p with
the property that every 2-colored set of n points in general position in d-space
can be partitioned into p monochromatic subsets whose convex hulls are pairwise
disjoint.

For a fixed $d \ge 2$, we have

(i) $p^{(d)}(n) \le \frac{n}{2} + O(1)$;

(ii) $p^{(d)}(n) \ge \mu_d(n)$.



2 Proof of Theorem 1

We prove the lower bound by induction on n. Clearly, we have $p(1) \ge 1$, $p(2) \ge 2$.
Assume the inequality holds for all values smaller than n. Consider a set $S_a$ of
n points placed on a circle, and having alternating colors, white and black (if n
is odd, there will be two adjacent points of the same color, say, white). We call
this configuration alternating. Denote by $h(n) = p(S_a)$ the minimum number of
parts necessary to partition an alternating configuration of n points. Clearly, we
have $p(n) \ge h(n)$.

Assume first that n is even, and fix a partition of $S_a$. We may suppose without
loss of generality that this partition has a monochromatic (say, white) part P
of size $l \ge 2$, otherwise the number of parts is n. The set $S_a \setminus P$ falls into l
contiguous alternating subsets, each consisting of an odd number, $n_1, n_2, \ldots, n_l$,
of elements, such that

$$\sum_{i=1}^{l} n_i + l = n, \quad \text{or} \quad \sum_{i=1}^{l} (n_i + 1) = n.$$

Hence, by induction,

$$h(n) \ge 1 + \sum_{i=1}^{l} h(n_i) \ge 1 + \sum_{i=1}^{l} \frac{n_i + 1}{2} = 1 + \frac{n}{2} = \left\lceil \frac{n+1}{2} \right\rceil,$$

as required. If n is odd, let P denote one of the two adjacent white points of $S_a$.
Then

$$h(n) = h(S_a) \ge h(S_a - \{P\}) = h(n-1) \ge \frac{n-1}{2} + 1 = \frac{n+1}{2} = \left\lceil \frac{n+1}{2} \right\rceil.$$

To prove the upper bound, we also use induction on n. Clearly, we have $p(1) \le 1$,
$p(2) \le 2$. Assume the inequality holds for all values smaller than n. If n is
even,

$$p(n) \le p(1) + p(n-1) \le 1 + \left\lceil \frac{n}{2} \right\rceil = 1 + \frac{n}{2} = \frac{n+2}{2} = \left\lceil \frac{n+1}{2} \right\rceil.$$

Let n be odd. Consider a set S of n points, n = w + b, where w and b denote
the number of white and black points respectively. Suppose, without loss of
generality, that $w \ge b + 1$. If conv(S), the convex hull of S, has two adjacent
vertices of the same color, we can take them as a monochromatic part and apply
the induction hypothesis to the remaining n − 2 points. Therefore, we may assume
that conv(S) has an even number of vertices, and they are colored white and
black, alternately. Let x be a white vertex of conv(S), and consider the points of
S − {x} in clockwise order of visibility from x. By writing a 0 for each white point
that we encounter and a 1 for each black point, we construct a {0, 1}-sequence
of length n − 1. Obviously, this sequence starts and ends with a 1. Removing
the first and the last elements, we obtain a sequence $T = a_0 a_1 \ldots a_{n-4}$ of (even)
length n − 3.

The sequence T consists of k = w − 1 zeroes and l = b − 2 ones, where
$k - l = (w - 1) - (b - 2) \ge 2$ and k + l is even. According to Claim 1 below, T has
a 00 contiguous subsequence (of two adjacent zeroes) starting at an even index
(0, 2, 4, ...). Let y and z denote the two white points corresponding to these two
consecutive zeroes. Partition the set S into the white triple {x, y, z} and two sets
of odd sizes, $n_1$ and $n_2$ (with $n_1 + n_2 = n - 3$), consisting of all points preceding
and following {y, z} in the clockwise order, respectively. It follows by induction
that

$$p(n) \le 1 + \frac{n_1 + 1}{2} + \frac{n_2 + 1}{2} = \frac{n+1}{2}.$$

It remains to establish

Claim 1. Let $T = a_0 a_1 \ldots a_{k+l-1}$ be a {0, 1}-sequence of even length, consisting
of k zeroes and l ones, where $k - l \ge 2$. Then there is an even index $0 \le i \le
k + l - 2$ such that $a_i = a_{i+1} = 0$.



Proof. We prove the claim by induction on $k + l$, the length of the string.
The base case $k + l = 2$ is clear, because then we have $T = 00$. For the
induction step, distinguish four cases, according to what the first two
elements of $T$ are: 00, 01, 10, or 11. In the first case we are done. In the
other three cases, the assertion is true, because the inequality
$k' - l' \ge 2$ holds for the sequence $T'$ obtained from $T$ by deleting its
first two elements, and an even index of $T'$ corresponds to an even index of
$T$. (Here $k'$ and $l'$ denote the number of zeroes and ones in $T'$,
respectively.) □
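
Claim 1 can also be checked mechanically. The following minimal sketch is not part of the original argument: the function name `even_double_zero` is hypothetical, and the sequence is assumed to be given as a Python list of 0s and 1s. It scans the disjoint pairs $(a_0 a_1), (a_2 a_3), \ldots$ and reports the first pair equal to 00. If no such pair existed, every pair would contain at least one 1, forcing $l \ge k$.

```python
def even_double_zero(T):
    """Return an even index i with T[i] == T[i + 1] == 0, or None if none exists.

    T is a list of 0s and 1s of even length (hypothetical input format).
    """
    for i in range(0, len(T) - 1, 2):
        if T[i] == 0 and T[i + 1] == 0:
            return i
    return None
```

For instance, the sequence 0 1 0 0 1 0 has $k = 4$ zeroes and $l = 2$ ones, and the pair starting at index 2 is 00.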

3 An Algorithmic Proof of Theorem 2



The functions $p(\cdot)$ and $p(\cdot,\cdot)$ are monotone increasing: if
$i \le j$ then $p(i) \le p(j)$, and if $i \le k$ and $j \le l$ then
$p(i,j) \le p(k,l)$. Thus, we have the lower bound in (1):

$$p(w, b) \ge p(w, w) \ge h(2w) \ge \left\lceil \frac{2w+1}{2} \right\rceil = w + 1.$$



Next we prove the upper bound in (1). For $w = 1$, the inequality holds, so
assume $w \ge 2$. Take an edge $xy$ of the convex hull of the $w$ white points,
and let $\ell$ denote its supporting line. Take $\{x, y\}$ as a monochromatic
(white) part of size 2, and another (black, possibly empty) monochromatic part
by selecting all black points lying in the open half-plane bounded by $\ell$,
which contains no white points. Continue this procedure in the other open
half-plane bounded by $\ell$ as long as it contains at least two white points.
If there is one (resp., no) white point left, then the remaining points can be
partitioned into at most three (resp., one) monochromatic parts, whose convex
hulls are pairwise disjoint. It is easy to verify that the number of parts in
the resulting partition does not exceed $2\lceil w/2 \rceil + 1$. It is also
true that the white parts of this partition form a perfect matching (of the
white points) if $w$ is even, and an almost perfect one (with one isolated
point) if $w$ is odd.

At the end of this section we present an $O(n \log n)$ time algorithm which
computes such a partition.

Further, we verify the three cases highlighted in the theorem. For even $w$,
the lower and upper bounds in (1) are both equal to $w + 1$. The same holds in
case (i), i.e., when $w \le b \le w + 1$: $p(w, b) \le p(2w + 1) = w + 1$. For
odd $w$, the gap between the lower and upper bounds is at most one.

It remains to discuss case (ii), when $w$ is odd and $b \ge 2w$, for which we
get a tight bound. By the monotonicity of $p(w, b)$, it is enough to show that
$p(w, 2w) \ge w + 2$. Let $Y$ be a regular $w$-gon. Let $X$ and $Z$ denote the
inner and outer polygons obtained from $Y$ by slightly shrinking it and blowing
it up, respectively, around its center $O$. To bring $X \cup Y \cup Z$ into
general position, slightly rotate $Y$ around its center in the clockwise
direction so that, if $x \in X$ and $z \in Z$ are the two vertices
corresponding to $y \in Y$, then the points $O, x, y, z$ are almost collinear.
Place a white point at each vertex of $Y$, and a black point at each vertex of
$X$ and $Z$. (See Fig. 1.)

We say that a triangle is white (resp. black), if all of its vertices are white
(resp. black). Notice that the white vertices must be partitioned into at least
$(w + 1)/2$ parts. Otherwise, there would be a white part of size (at least) 3,
which is impossible, because every white triangle $\Delta$ contains a black
point of $X$ in its interior. (If the center $O$ is inside $\Delta$, there are
three black points of $X$ inside $\Delta$. If $O$ is outside $\Delta$, the black
point of $X$ corresponding to the "middle" point of $\Delta$ lies in $\Delta$.)
Similarly, the vertices of $Z$ require at least $(w + 1)/2$ parts.

Fig. 1. A point set proving the lower bound in Theorem 2

Assume, for contradiction, that $S$ can be partitioned into at most $w + 1$
parts. Then the monochromatic parts restricted to the vertices of $Z$ form an
almost perfect non-crossing matching using straight-line segments, with exactly
one unmatched point, and the same holds for $Y$. Let $z_0 \in Z$ be the
unmatched point of $Z$. We restrict our attention to the matching $M$ formed by
the vertices of $Z$. These segments partition the circumscribing circle $C$ of
$Z$ into $(w + 1)/2$ convex regions. The boundary $B(R)$ of a region $R$ is
formed by some segments in $M$ and arcs of $C$. Let $Z_0$ denote the region
containing the center $O$. Each point of $X$ must be assigned to a black part
determined by $M$. We distinguish two cases.

Case 1: $z_0 \notin Z_0$ (see figure, left). Pick any segment
$s \in M \cap B(Z_0)$, and let $z$ denote its first vertex in the clockwise
order. Let $x$ denote the vertex of $X$ belonging to the ray $Oz$. It is easy
to see that $x$ cannot be assigned to any of the black parts determined by $M$,
since the black triangle obtained in this way would contain a white point in its
interior, a contradiction.

Case 2: $z_0 \in Z_0$ (see figure, right). Let $x_0$ be the vertex of $X$
belonging to the ray $Oz_0$. The point $x_0$ can only be assigned to the part
containing $z_0$. By adding the segment $x_0 z_0$ to $M$, we obtain a
non-crossing matching $M'$. Let $z$ be the first vertex on $B(Z_0)$ following
$z_0$ in the clockwise order. Let $x$ denote the vertex of $X$ belonging to the
ray $Oz$. Similarly to Case 1, it is easy to see that $x$ cannot be assigned to
any of the black parts determined by $M'$, since any black triangle obtained in
this way would contain a white point, a contradiction.

Thus, any partition of S requires at least w + 2 parts, which completes the 
proof of the theorem. 

Algorithm outline. We follow the general scheme described in the proof of
the upper bound in Theorem 2, but allow some variations. Let $W$ and $B$ denote
the set of white and black points of $S$, respectively.

1. This step ignores the black points. First, compute the convex layer
decomposition of $W$. Based on this decomposition, find a perfect (or almost
perfect) non-crossing matching $M$, using straight-line segments of $W$. At the
same time, construct a planar subdivision $D$ of the plane into at most
$\lceil w/2 \rceil + 1$ convex regions by extending the segments of $M$ along
their supporting lines. (If $w$ is odd, draw an arbitrary line through the
point which is left unmatched by $M$, and consider this point a part of $M$.)
See Claim 2 below for a more precise description of the partitioning procedure
using the extension of segments.

2. Here the black points come into play. Preprocess D for point location and 
perform point location in D for each black point. 

3. Output the partition consisting of the pairs of points in M (the unmatched 
point as a part, if w is odd) for the white points, and the groups of points in 
the same region for the black points. 

The following statement is an easy consequence of Euler’s polyhedral formula 
(see also page 259). 

Claim 2. Let $S = \{s_1, s_2, \ldots, s_m\}$ be a family of $m$ pairwise
disjoint segments in the plane, whose endpoints are in general position. In the
given order, extend each segment in both directions until it hits another
segment or the extension of a segment, or to infinity. After completing this
process, the plane will be partitioned into $m + 1$ convex regions.

Algorithm details and analysis. The algorithm outputs at most
$2\lceil w/2 \rceil + 1$ monochromatic parts, $\lceil w/2 \rceil$ of which are
white and at most $\lceil w/2 \rceil + 1$ are black. The decomposition into
convex layers takes $O(n \log n)$ time |3]. The matching $M$ is constructed
starting with the outer layer and continues with the next layer, etc. If, for
example, the size of the outer layer is even, every other segment in the layer
is included in $M$. If the size of the outer layer is odd, we proceed in the
same way, but at the end we compute the tangent from the last unmatched point
to the convex polygon of the next layer, include this segment in $M$, and
proceed to the next layer. The algorithm maintains at each step a current convex
(possibly unbounded) polygonal region $P$ containing all unmatched points. It
also maintains the current planar subdivision $D$ computed up to this step. The
next segment $s$ to be included in $M$ lies inside $P$. Let $s$ be oriented so
that all unmatched (white) points are to the right of it. After extending $s$,
$P$ will fall into two convex polygonal regions, $P'$ and $P''$, containing all
the remaining unmatched points, and none of the points, respectively. At this
point, $P'$ becomes the new $P$ and $P''$ is included in the polygonal
subdivision $D$, which is thus refined. An example of step 1 is shown in the
accompanying figure, with the segments numbered in their order of inclusion in
$M$.

For each convex layer, we compute a balanced hierarchical representation (see
Lemma 1 below). Since a tangent to a convex polygon of size at most $n$ from an
exterior point can be computed in $O(\log n)$ time, having such a
representation, the time to compute all tangents is $O(n \log n)$. By
dynamically maintaining the same (balanced hierarchical) representation for the
current polygonal region $P$, we can compute the (at most two) intersection
points between $P$ and the
supporting line of s in O(logn) time. The reader is referred to cn, pages 84-88 
for this representation and for the two statements below.

Lemma 1. (page 85 of the reference above) A balanced hierarchical
representation of a convex polygon on $n$ points can be computed in $O(n)$ time.

Lemma 2. (page 87 of the reference above) Given a balanced hierarchical
representation of a convex polygon $P$ on $n$ points and a line $\ell$,
$P \cap \ell$ can be computed in $O(\log n)$ time.

One can obtain a balanced hierarchical representation of a convex polygon in
a natural way, by storing the sequence of edges in the leaves of a balanced
tree. Furthermore, selecting (2-3)-trees for this representation allows us to
get the balanced hierarchical representation of $P'$ from that of $P$ in
$O(\log n)$ time by using the SPLIT operation on concatenable queues (see [2],
pages 155-157). To get $P'$ from $P$ calls for at most two INSERT and at most
two SPLIT operations, each taking $O(\log n)$ time. The update of $D$ with
$P''$ takes $O(|P''|)$ time. Since $\sum |P''|$ over the execution of the
algorithm is $O(n)$, the total time required by step 1 is $O(n \log n)$.

We recall the following well known fact (e.g. see P21, page 77). 

Lemma 3. For any planar graph with $n$ vertices, one can build a point location
data structure of $O(n)$ size in $O(n \log n)$ time, guaranteeing $O(\log n)$
query time.

The point location is performed for $b < n$ points, so the total time of step 2
is also $O(n \log n)$. The complexity of the last step is linear, thus the total
time complexity of the algorithm is $O(n \log n)$. The space requirement is
$O(n)$.

Another approach. The next algorithm closely mimics the proof of the upper
bound in Theorem 2. Using the method of (or of PI) for convex hull
maintenance under deletions of points, a sequence of $n$ deletions performed on
an $n$-point set takes $O(n \log n)$ time. Specifically, maintain the convex
hull of the white points and that of the black points. Take an edge $e$ of the
white hull. Repeatedly, query the black hull to find a black point in the
halfplane determined by $e$ containing none of the white points. Such points
(if any) are deleted one by one, until no other is found, and their collection
is output as a black part. Then the two endpoints of $e$ are deleted and this
pair is output as a white part. The above step is repeated until the white
points are exhausted and the partition is complete. The $O(n)$ queries and
deletions take $O(n \log n)$ time.
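
The procedure can be illustrated by a naive quadratic-time sketch. This is not the $O(n \log n)$ method described above: it recomputes a hull edge of the remaining white points from scratch at every step, rather than maintaining dynamic convex hulls. The function names (`cross`, `hull_edge`, `partition`) and the representation of points as coordinate pairs are assumptions made for illustration, and the input is assumed to be in general position (no three points collinear). When a single white point is left, the sketch uses the line through it and an arbitrary remaining black point as the dividing line, which is one possible choice.

```python
def cross(o, a, b):
    """Positive if b lies strictly to the left of the directed line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_edge(W):
    """Return a hull edge (x, y) of W with all other points of W on its left."""
    x = min(W)                      # lexicographically smallest point is a hull vertex
    y = next(p for p in W if p != x)
    for z in W:                     # one gift-wrapping step starting from x
        if z != x and cross(x, y, z) < 0:
            y = z
    return x, y

def partition(white, black):
    """Split the points into monochromatic parts, following the upper-bound proof."""
    W, B, parts = list(white), list(black), []
    while len(W) >= 2:
        x, y = hull_edge(W)
        # black points in the open half-plane that contains no white point
        cut = [p for p in B if cross(x, y, p) < 0]
        if cut:
            parts.append(("black", cut))
        B = [p for p in B if cross(x, y, p) >= 0]
        parts.append(("white", [x, y]))
        W = [p for p in W if p not in (x, y)]
    if W:                           # one white point left: at most three more parts
        p = W[0]
        parts.append(("white", [p]))
        if B:
            q = B[0]                # dividing line through p and q (an arbitrary choice)
            left = [b for b in B if b == q or cross(p, q, b) > 0]
            right = [b for b in B if b != q and cross(p, q, b) < 0]
            parts.extend(("black", side) for side in (left, right) if side)
    elif B:                         # no white point left: one black part
        parts.append(("black", B))
    return parts
```

Each white pair contributes at most one black part, and the final step adds at most three parts for odd $w$ and at most one for even $w$, so the number of parts returned is at most $2\lceil w/2 \rceil + 1$.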

4 Coloring with More Colors: Proof of Theorem 3

We prove the lower bound by induction on $n$. Clearly, we have $p_k(1) \ge 1$,
$p_k(2) \ge 2, \ldots, p_k(k) \ge k$. Assume the inequality holds for all
values smaller than $n$. Consider a set $S$ of $n$ points placed on a circle,
colored with $k$ colors, $1, 2, \ldots, k$, in a periodic fashion:

$$1, 2, \ldots, k,\ 1, 2, \ldots, k,\ \ldots,\ 1, 2, \ldots, r,$$

where $n = mk + r$, $0 \le r \le k - 1$. We call such a configuration periodic.
Let $h_k(n) := p_k(S)$, the smallest number of monochromatic parts a periodic
configuration $S$ can be partitioned into so that their convex hulls are
pairwise disjoint. Clearly, we have $p_k(n) \ge h_k(n)$. Fix such a partition.

Case 1: $r = 0$. If every monochromatic part of the partition is a singleton,
then we are done. Consider a monochromatic part $P$ of size $l \ge 2$, and
assume without loss of generality that the color of $P$ is $k$. $S - P$ splits
into smaller periodic subsets of sizes $n_1, n_2, \ldots, n_l$, with
$n_i \equiv -1 \pmod{k}$ for every $i$, such that

$$\sum_{i=1}^{l} n_i + l = n, \quad \text{or} \quad \sum_{i=1}^{l} (n_i + 1) = n. \tag{2}$$

Hence, by induction,

$$h_k(n) \ge 1 + \sum_{i=1}^{l} h_k(n_i) \ge 1 + \sum_{i=1}^{l} \left\lceil \frac{(k-1)n_i + 1}{k} \right\rceil = 1 + \sum_{i=1}^{l} \frac{(k-1)n_i + (k-1)}{k}$$

$$= 1 + \frac{k-1}{k} \sum_{i=1}^{l} (n_i + 1) = 1 + \frac{(k-1)n}{k} = \frac{(k-1)n + k}{k} = \left\lceil \frac{(k-1)n + 1}{k} \right\rceil.$$



Case 2: $r = 1$. Discard 1 point of color 1 from the last (incomplete) group,
and get a periodic configuration with $n - 1$ points, as in the previous case.
Then, by induction,

$$h_k(n) \ge h_k(n-1) \ge \left\lceil \frac{(k-1)(n-1) + 1}{k} \right\rceil = \left\lceil \frac{(k-1)(n-1) + k}{k} \right\rceil = \left\lceil \frac{(k-1)n + 1}{k} \right\rceil.$$


Case 3: $r \ge 2$. We distinguish two subcases:

Subcase 3.a: There exists a monochromatic part $P$ of size $l \ge 2$ of color 1.
The set $S - P$ splits into smaller periodic subsets of sizes
$n_1, n_2, \ldots, n_l$ satisfying (2), where $n_i \equiv -1 \pmod{k}$ for all
$1 \le i \le l - 1$, and $n_l \equiv r - 1 \pmod{k}$. Then, using the induction
hypothesis, we obtain

$$h_k(n) \ge 1 + \sum_{i=1}^{l} h_k(n_i) \ge 1 + \sum_{i=1}^{l} \left\lceil \frac{(k-1)n_i + 1}{k} \right\rceil$$

$$= 1 + \sum_{i=1}^{l-1} \frac{(k-1)n_i + (k-1)}{k} + \frac{(k-1)n_l + (r-1)}{k}$$

$$= 1 + \sum_{i=1}^{l} \frac{(k-1)(n_i + 1)}{k} + \frac{r-k}{k} = \frac{(k-1)n + r}{k}$$

$$= \left\lceil \frac{(k-1)n + r - (k-1)}{k} \right\rceil = \left\lceil \frac{(k-1)n + 1}{k} \right\rceil.$$



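Subcase 3.b: There is no monochromatic part $P$ of size $l \ge 2$ of color 1.
Then

$$h_k(n) \ge \frac{n-r}{k} + h_{k-1}\left(n - \frac{n-r}{k}\right) \ge \frac{n-r}{k} + \left\lceil \frac{(k-2)\left(n - \frac{n-r}{k}\right) + 1}{k-1} \right\rceil$$

$$= m + \left\lceil \frac{(k-2)((k-1)m + r) + 1}{k-1} \right\rceil = (k-1)m + \left\lceil \frac{(k-2)r + 1}{k-1} \right\rceil.$$

The lower bound we want to prove can be written as

$$\left\lceil \frac{(k-1)n + 1}{k} \right\rceil = (k-1)m + \left\lceil \frac{(k-1)r + 1}{k} \right\rceil.$$

An easy calculation shows that

$$\left\lceil \frac{(k-2)r + 1}{k-1} \right\rceil = r + \left\lceil \frac{1-r}{k-1} \right\rceil = r + \left\lceil \frac{1-r}{k} \right\rceil = \left\lceil \frac{(k-1)r + 1}{k} \right\rceil.$$

The value of the above expression is 1 for $r = 0$, and $r$ for $r \ge 1$ (only
the latter case is of interest in this subcase). This concludes the proof of the
lower bound. Next, we establish the upper bound in Theorem 3.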

Proposition 1. $p_k(n) \le n - g_k(n)$.

Proof. A set of $n$ points contains a non-crossing matching of $g_k(n)$ pairs.
Thus, using parts of size 2 for the points in the matching and singletons for
the remaining ones, we obtain a partition, in which the number of parts is

$$g_k(n) + (n - 2g_k(n)) = n - g_k(n).$$

□

Proposition 2. [3 For $k \ge 3$, we have $g_k(6k + 1) = 6$.

It follows immediately from Propositions 1 and 2 that

$$p_k(n) \le n - g_k(n) \le \left(1 - \frac{6}{6k+1}\right) n + O(1) = \left(1 - \frac{1}{k + 1/6}\right) n + O(1),$$

as required.
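
To get a feeling for how far apart the two bounds are, they can be evaluated numerically. The sketch below is an illustration only, with hypothetical function names. The lower bound is $\lceil ((k-1)n + 1)/k \rceil$ from the induction above. The upper bound assumes that 6 pairs are matched in each complete group of $6k + 1$ points, so that $p_k(n) \le n - 6\lfloor n/(6k+1) \rfloor$ for $k \ge 3$.

```python
def lower_bound(k, n):
    """ceil(((k - 1) * n + 1) / k), computed with integer arithmetic."""
    x = (k - 1) * n + 1
    return -(-x // k)

def upper_bound(k, n):
    """n - 6 * floor(n / (6k + 1)): six matched pairs per complete group (k >= 3)."""
    return n - 6 * (n // (6 * k + 1))
```

For $k = 3$ and $n = 100$ this gives a lower bound of 67 and an upper bound of 70.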

In the end, we outline an algorithm. Its main part is the computation of the
matching $M$, which appears in \I\. Let $k \ge 3$ be a fixed integer.

Algorithm. The $n$ points are sorted according to their x-coordinate and divided
into consecutive groups of size $6k + 1$. In each group, with the possible
exception of the last one, one can match at least 12 elements. This can be done
by matching 12 out of 19 points in the three largest color classes. The time to
process a group is $O(k)$, so the total time is $O(n \log n)$. The number of
output parts is bounded as in the theorem.

5 Higher Dimensions: Proof of Theorem 4

The upper bound (i) can be shown by straightforward generalization of the proof
of Theorem 2: if $w \le b$, $n = w + b$, $|W| = w$, one can iteratively remove
a facet of conv($W$). The easy details are left to the reader.

To verify part (ii), we need a simple property of the $d$-dimensional moment
curve

$$M_d(\tau) = (\tau, \tau^2, \ldots, \tau^d), \quad \tau \in \mathbb{R}.$$

For two points, $x = M_d(\tau_1)$ and $y = M_d(\tau_2)$, we say that $x$
precedes $y$ and write $x \prec y$ if $\tau_1 < \tau_2$. For simplicity, let
$u = \lceil d/2 \rceil$ and $v = \lfloor d/2 \rfloor$.

Lemma 4. (see 0, j^) Let $x_1 \prec x_2 \prec \ldots \prec x_{u+1}$ and
$y_1 \prec y_2 \prec \ldots \prec y_{v+1}$ be two sequences of distinct points
on the $d$-dimensional moment curve, whose convex hulls are denoted by $X$ and
$Y$, respectively.

Then $X$ and $Y$ cross each other if and only if the points $x_i$ and $y_j$
interleave, i.e., every arc $x_i x_{i+1}$ of $M_d(\tau)$ contains exactly one
$y_j$.

Let $S_a$ be a sequence of $n$ points on the $d$-dimensional moment curve,
$x_1 \prec x_2 \prec \ldots \prec x_n$, which are colored white and black,
alternately. Consider a partition of $S_a$ into $m$ monochromatic parts, labeled
by integers between 1 and $m$, and suppose that the convex hulls of these parts
are pairwise disjoint. Replacing every element of the sequence $S_a$ by the
label of the part containing it, we obtain a sequence $T$ of length $n$, whose
elements are integers between 1 and $m$. Obviously, no two consecutive elements
of this sequence coincide, because adjacent points have different colors, and
thus belong to different parts in the partition. It follows from the above lemma
that $T$ has no alternating subsequence of length $(u + 1) + (v + 1) = d + 2$.
That is, $T$ is an $(m, d)$-Davenport-Schinzel sequence of length $n$.
Therefore, we have $n \le \lambda_d(m)$, or, equivalently, $m \ge \mu_d(n)$,
as required.
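
The two properties of the label sequence used here (no two equal consecutive elements, and no alternating subsequence $a, b, a, b, \ldots$ of length $d + 2$) can be tested directly. The sketch below is an illustration only: the function names are hypothetical, and the sequence is assumed to be given as a Python list of part labels. For a fixed pair of labels, the longest alternating subsequence equals the number of runs left after discarding all other labels.

```python
from itertools import combinations

def max_alternation(T):
    """Length of the longest alternating subsequence a, b, a, b, ... in T."""
    best = min(len(set(T)), 1)          # a single label gives length 1
    for a, b in combinations(set(T), 2):
        runs, last = 0, None
        for t in T:
            if t in (a, b) and t != last:
                runs, last = runs + 1, t
        best = max(best, runs)
    return best

def is_valid_labeling(T, d):
    """True if T has no repeated neighbors and no alternation of length d + 2."""
    no_repeat = all(s != t for s, t in zip(T, T[1:]))
    return no_repeat and max_alternation(T) < d + 2
```

For example, with $d = 2$ the sequence 1, 2, 1, 3, 1 passes the test, while 1, 2, 1, 2 contains an alternation of length $4 = d + 2$ and fails.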

1 Introduction

Let $S$ be a set of $n$ data points in some metric space. Given a query point
$q$ in this space, a nearest neighbor query asks for the nearest point of $S$
to $q$. Throughout we will assume that the space is real $d$-dimensional space
$\mathbb{R}^d$ and the metric is Euclidean distance. The goal is to preprocess
$S$ into a data structure so that such queries can be answered efficiently.
Nearest neighbor searching has applications in many areas, including data
mining, pattern classification, and data compression.

Because many applications involve large data sets, we are interested in data
structures that use linear storage space. Naively, nearest neighbor queries can
be answered in $O(dn)$ time through brute-force search. Although nearest
neighbor searching can be performed efficiently in low-dimensional spaces, for
all known exact linear-space data structures, search times grow exponentially as
a function of dimension. Thus for reasonably large dimensions, brute-force
search is often the most efficient in practice. One approach to reducing the
search time is through approximate nearest neighbor search. A number of data
structures for approximate nearest neighbor searching have been proposed. The
phenomenon of concentration of distance would suggest that approximate nearest
neighbor searching is meaningless. Fortunately, the distributions that arise in
applications tend to be clustered in lower dimensional subspaces. Good search
algorithms take advantage of this low-dimensional clustering.
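
For reference, the brute-force baseline mentioned above can be sketched in a few lines. The function name `brute_force_nn` and the representation of points as tuples of coordinates are assumptions made for illustration. Each of the $n$ distance computations costs $O(d)$ time, which gives the $O(dn)$ bound.

```python
import math

def brute_force_nn(S, q):
    """Return (nearest point of S to q, its Euclidean distance) by a linear scan."""
    best_point, best_dist = None, math.inf
    for p in S:
        dist = math.dist(p, q)      # O(d) work per data point
        if dist < best_dist:
            best_point, best_dist = p, dist
    return best_point, best_dist
```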

The fundamental problem that motivates this work is the lack of predictability
in existing practical approaches to nearest neighbor searching. In high
dimensions, exact search is no better than brute-force, and approximate search
algorithms are acceptably fast only when the allowed approximation factors are
set to unreasonably high values. The goal of this work is to address this
shortcoming by providing a practical data structure for nearest neighbor
searching that is both efficient and guarantees predictable performance.

We measure performance in an aggregate sense, rather than for individual
queries. We assume that queries are drawn from some known probability
distribution. The user provides a desired failure probability, $p_f$, and the
resulting search fails to return the true nearest neighbor with probability at
most $p_f$. The query distribution is presented by a set of training queries as
part of the preprocessing stage. The training data permits us to tailor the data
structure to the underlying structure of the point distribution.

* The support of the National Science Foundation under grant CCR-9712379 is
gratefully acknowledged.

The idea of allowing failures in nearest neighbor searching was proposed
by Ciaccia and Patella [5], but no performance guarantees were provided. We
present a data structure called an overlapped split tree, or os-tree for short.
The tree, which we first introduced in an earlier paper, is a generalization of
the well known BSP-tree, in which each node of the tree is associated with a
convex region of space. However, unlike the BSP-tree in which the child regions
partition the parent's region, here we allow these regions to overlap one
another. The degree of overlap is determined by the failure probability and the
query distribution as represented by the training points.

Based on empirical experiments on both synthetic and real data sets, we have
shown that, compared to the popular kd-tree data structure, it is possible to
achieve significantly better predictability of failure probabilities while
achieving running times that are competitive with, and sometimes superior to,
the kd-tree. However, that paper did not address the efficiency of the data
structure except for experiments on synthetic data sets. In this paper we
present a theoretical analysis of the os-tree's efficiency and performance and
provide experimental evidence of its efficiency on real application data sets.

2 Overlapped Split Tree 

The os-tree was introduced by the authors in an earlier paper. We summarize
the main elements of the data structure here. It is a generalization of the
well-known binary space partition (BSP) tree. Consider a set $S$ of
points in d-dimensional space. A BSP-tree for this set of points is based on a 
hierarchical subdivision of space into convex polyhedral cells. Each node in the 
BSP-tree is associated with such a cell and the subset of points lying within this 
cell. The root node is associated with the entire space and the entire point set. 
A cell is split into two disjoint cells by a separating hyperplane, and these two 
cells are then associated with the children of the current node. Data points are 
then distributed among the children according to which side of the separating 
hyperplane they lie. The process is repeated until the number of points associated 
with a node falls below a given threshold, which we assume to be one throughout. 

The os-tree generalizes the BSP-tree by associating each node with an
additional convex polyhedral region called a cover. (In [12] the term
approximate cover was used.) Consider a node $\delta$ in the tree, associated
with the point subset $S_\delta$. The cover $C_\delta$ is constructed to contain
a significant fraction of the points of $\mathbb{R}^d$ whose nearest neighbor is
in $S_\delta$, that is, $C_\delta$ covers a significant fraction of the union of
the Voronoi cells of the points of $S_\delta$. Computing these Voronoi cells
would be too expensive in high dimensional spaces, and so the covers are
computed with the aid of a large set of training points $T$, which is assumed to
be sampled from the same distribution as the data points, and where
$|T| \gg |S|$. For each point in $T$ we compute its nearest neighbor in $S$. The
cover $C_\delta$ contains all the points of $T$ whose nearest neighbor is in
$S_\delta$. See Fig. 1. As the size of $T$ increases, the probability that a
point lies outside of $C_\delta$ but has its nearest neighbor in $S_\delta$
decreases.




Fig. 1. The cover for a set of data points $S$ (filled circles) contains all the
training points $T$ (shown as white circles) whose nearest neighbor is in this
subset.



As with the BSP tree, covers are not stored explicitly, but rather are defined 
by a set of boundary hyperplanes stored at the nodes of each of the ancestors. 
In typical BSP fashion, the parent cell is split by a hyperplane, which parti- 
tions the point set and cell. We introduce two halfspaces, supported by parallel 
hyperplanes. One covers the Voronoi cells associated with the left side of the 
split and the other covers the Voronoi cells associated with the right side. These 
hyperplanes are stored with the parent cell. Thus, as we descend the tree, the 
intersection of the associated halfspaces implicitly defines the cover. 

The construction algorithm works recursively, starting with the root. The
initial point set consists of all the data points, and the initial cell and
cover are the entire space. Consider a node $\delta$ containing two or more
data points. First, we compute a separating hyperplane $H$ for the current
point set (see Fig. 2). This hyperplane is chosen to be orthogonal to the
largest eigenvector of the covariance matrix of the points of $S_\delta$. This
is the direction of maximum variance for the points of $S_\delta$. The position
of the hyperplane is selected so that it bisects the points of $S_\delta$. Let
$S_l$ and $S_r$ denote the resulting partition, and define $T_l$ and $T_r$
analogously. (This partition is indicated using circles and squares in Fig. 2.)

To determine the orientation of the boundary hyperplanes, we apply
support-vector machines (SVM) to the subsets $S_l$ and $S_r$. SVM was developed
in learning theory to find the hyperplane that best separates two classes
of points in multidimensional space. SVM finds a splitting hyperplane with the
highest margin, which is defined to be the sum of the nearest distance from the
hyperplane to the points in $S_l$ and the nearest distance to the points in
$S_r$. After SVM determines the orientation of the splitting hyperplane, the two
boundary hyperplanes $H_l$ and $H_r$ are chosen to be parallel to this
hyperplane, such that $H_l$ bounds all the points of $T_l$ and $H_r$ bounds all
the points of $T_r$. The hyperplanes $H_l$ and $H_r$ are stored in node
$\delta$, and the process continues recursively on each of the two subsets.

Fig. 2. The construction of the boundary hyperplanes ($H_l$ and $H_r$) and the
resulting partition of the data and training points.



To answer the nearest neighbor query $q$, the search starts at the root. For
any leaf, the distance to the point in this node is returned. At any internal
node $\delta$, we first determine in which cover(s) the query lies. This is done
in $O(d)$ time by comparing $q$ against each of the boundary hyperplanes. If $q$
lies in only one cover, then we recursively search the corresponding subtree and
return the result. Otherwise, the subtree closest to $q$ (say the left) is
searched first, and the distance $d_l$ to the nearest neighbor in this subtree
is returned. If the distance from $q$ to the other (right) boundary hyperplane
is less than $d_l$, then this subtree is also searched, resulting in a distance
$d_r$. The minimum of $d_l$ and $d_r$ is returned.

It is easy to see that this search will fail only if for some node $\delta$,
the query point $q$ lies in the Voronoi cell of some point $p \in S_\delta$,
but lies outside the associated cover. It is not hard to show that if $T$ and
$S$ are independent and identically distributed, then the probability of such a
failure is proportional to $|S|/|T|$. (The proof is omitted from this version.)
By making the training set sufficiently large, we can reduce the failure
probability below any desired threshold.
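
The construction and search can be sketched as follows. This is a deliberately simplified variant and not the authors' implementation, and all names (`nearest`, `build`, `search`) are hypothetical. Three assumptions differ from the text: the split direction is the coordinate axis of largest variance rather than the principal eigenvector, the boundary hyperplanes are taken parallel to the split instead of being oriented by SVM, and a query lying in both covers always searches both subtrees rather than pruning by the distance to the other boundary hyperplane. Each cover bound is also extended to include the data points of its side. Data points are assumed to be distinct tuples.

```python
import math

def nearest(S, q):
    """Brute-force nearest neighbor of q in the list S."""
    return min(S, key=lambda p: math.dist(p, q))

def build(S, T):
    """Build a node for data points S; T holds the training points whose
    nearest neighbor lies in S."""
    if len(S) == 1:
        return {"point": S[0]}
    def spread(a):                                  # variance along axis a (unnormalized)
        mean = sum(p[a] for p in S) / len(S)
        return sum((p[a] - mean) ** 2 for p in S)
    axis = max(range(len(S[0])), key=spread)        # assumption: axis-parallel split
    order = sorted(S, key=lambda p: p[axis])
    Sl, Sr = order[:len(S) // 2], order[len(S) // 2:]   # bisect the data points
    in_left = set(Sl)
    Tl, Tr = [], []
    for t in T:                                     # distribute the training points
        (Tl if nearest(S, t) in in_left else Tr).append(t)
    hl = max(p[axis] for p in Sl + Tl)              # left cover:  x[axis] <= hl
    hr = min(p[axis] for p in Sr + Tr)              # right cover: x[axis] >= hr
    return {"axis": axis, "hl": hl, "hr": hr,
            "left": build(Sl, Tl), "right": build(Sr, Tr)}

def search(node, q):
    """Return the distance from q to the data point found by the tree search."""
    if "point" in node:
        return math.dist(node["point"], q)
    x = q[node["axis"]]
    in_l, in_r = x <= node["hl"], x >= node["hr"]
    if in_l and in_r:                               # overlap: search both subtrees
        return min(search(node["left"], q), search(node["right"], q))
    if in_l or (not in_r and x - node["hl"] <= node["hr"] - x):
        return search(node["left"], q)              # only the left cover, or nearer to it
    return search(node["right"], q)
```

Under these assumptions every training point is answered exactly, since at each node the cover of the correct child contains it. Queries outside the training set can fail, which is the failure event discussed above.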

3 Theoretical Analysis 

In this section we explore the expected search time of the os-tree. As the
search visits each nonleaf node of the os-tree, there are three possibilities:
visit the left child only, visit the right child only, or visit both children.
Suppose that the probability that we visit both children at any given node is
bounded above by $b$, $0 < b < 1$. Let $T(n)$ denote the expected running time
of the search algorithm given a subtree with $n$ data points. Then because we
do $O(d)$ computations at each node and split the data points evenly at each
step, it follows that up to constant factors, $T(n)$ is bounded by the following
recurrence

$$T(n) \le 2b\,T(n/2) + (1-b)\,T(n/2) + d = (1+b)\,T(n/2) + d.$$
Solving the recurrence yields, for fixed $b > 0$,

$$T(n) = O\left(d\, n^{\log_2 (1+b)}\right).$$

Thus the analysis of the expected search reduces to bounding the value of b. 
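
The effect of $b$ on the recurrence $T(n) \le (1+b)T(n/2) + d$ can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $n$ is a power of two, and the base case is taken to be $T(1) = d$. Unrolling the recurrence gives the geometric sum $d \sum_{i=0}^{\log_2 n} (1+b)^i$, whose largest term is $d(1+b)^{\log_2 n} = d\, n^{\log_2(1+b)}$.

```python
def search_cost(n, b, d):
    """Unroll T(n) = (1 + b) * T(n / 2) + d down to T(1) = d (n a power of two)."""
    t = d
    while n > 1:
        t = (1 + b) * t + d
        n //= 2
    return t

def closed_form(n, b, d):
    """d * sum_{i=0}^{log2 n} (1 + b)^i, the sum obtained by unrolling."""
    levels = n.bit_length() - 1      # log2(n) for a power of two
    return d * sum((1 + b) ** i for i in range(levels + 1))
```

With $b = 1$ the cost is $d(2n - 1)$, linear in $n$ as in brute-force search, while with $b = 0$ it is $d(\log_2 n + 1)$.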

In general, $b$ is a function of the dimension $d$ and the point distribution. It
is not difficult to contrive distributions in which, on any given node, virtually 
all query points visit both subtrees. In such cases the above analysis suggests 
that the running time is no better than brute-force search. Our analysis is based 
on some assumptions on the data distribution, which we believe are reasonable 
models of typical real data sets. 

An important aspect of real data sets in high dimensional spaces is that many
of the coordinates are correlated and some exhibit very small variances. Hence,
points tend to be clustered (at least locally) around low dimensional spaces. A
common way of modeling this is through principal component analysis. Consider
a fixed node $\delta$ in the tree and the subset $S_\delta$ of data points
associated with this node, and let $n_\delta$ be the cardinality of this set.
Let $C_\delta$ denote the corresponding cover. Because $\delta$ will be fixed
for the rest of the analysis, we will drop these subscripts to simplify the
notation. We assume that the data points are sampled from a $d$-dimensional
multivariate distribution. Let $x = (x_i)_{i=1}^{d}$ denote a random vector
from this distribution. Let $\mu \in \mathbb{R}^d$ denote the mean of this
distribution and let $\Sigma$ denote the $d \times d$ covariance matrix for the
distribution,

$$\Sigma = E\left((x - \mu)(x - \mu)^T\right).$$

This is a symmetric positive definite matrix, and hence has $d$ positive
eigenvalues. We can express this as $\Sigma = U \Lambda U^T$, where $U$ is a
$d \times d$ orthogonal matrix whose columns are the eigenvectors of $\Sigma$
and $\Lambda$ is a $d \times d$ diagonal matrix, whose entries
$\{\lambda_i\}_{i=1}^{d}$ are the eigenvalues of $\Sigma$. We may assume that
the eigenvalues are sorted in decreasing order, so that $\lambda_1$ is the
largest eigenvalue. By applying the transformation $y = U^T(x - \mu)$, we map
the points to a reference system in which the samples have mean 0, and a
diagonal covariance matrix with the prescribed eigenvalues. The coordinates are
now uncorrelated. We may assume henceforth that all point coordinates are given
relative to this new reference system, so that $\lambda_i$ is the distribution
variance for the $i$th coordinate. Henceforth, let
$\sigma_i = \sqrt{\lambda_i}$.

Given a parameter $\gamma > 0$, define the pseudo-dimension $\hat{d}$ to be the
minimum integer such that $1 \le \hat{d} \le d$ and



E 

d<.i<d 



<r 



Under the assumption that the points are clustered in a low-dimensional
subspace, we would expect that $\hat{d}$ is small relative to $d$. Let $H$ be
the $\hat{d}$-dimensional hyperplane spanned by the first $\hat{d}$ eigenvectors
of $\Sigma$. Given a query point $q = (q_i)_{i=1}^{d}$ sampled from the same
distribution as the data points, let $\hat{q}$ denote its orthogonal projection
onto $H$, and let $e(q) = \|q - \hat{q}\|$, where $\|\cdot\|$ denotes Euclidean
distance. It is well known that the expected value of $e^2(q)$ is

$$E(e^2(q)) = \sum_{\hat{d} < i \le d} \sigma_i^2.$$

Our analysis is based on the following distribution assumptions. The first
is that the pseudo-dimension $\hat{d}$ is significantly smaller than the
dimension $d$ of the space. The second is a type of Lipschitz condition on the
distribution function, which states that point densities are bounded relative to
the principal components. In particular, consider the restriction of the point
distribution to the first $\hat{d}$ coordinates. We assume that there exist
positive constants $c_1$ and $c_2$ such that given any point $q \in C \cap H$
and for all sufficiently small positive $r$, (1) the probability of a point
lying within a ball of radius $r$ centered at $q$ is at least
$(c_1 r/\sigma_1)^{\hat{d}}$, and (2) for all positive $x$, the probability that
$|q_1| \le x$ is at most $c_2 x/\sigma_1$. Note that these conditions are
satisfied for a uniform distribution assuming that the first $\hat{d}$
eigenvalues are bounded away from 0.

Recall that in the construction of the os-tree, we first partition the points 
into equal sizes using a separating hyperplane that is orthogonal to the largest 
eigenvector of the sample covariance matrix. We assume that n is large enough 
that the differences in the sample covariance matrix and the distribution covari- 
ance matrix are negligible. In the os-tree construction, we use SVM to compute 
the best orientation for the boundary hyperplanes. To simplify the analysis, let 
us assume that the boundary hyperplanes are chosen to be orthogonal to the 
largest eigenvector. Due to space limitations the proof has been omitted. It will 
be presented in the full version of the paper. 

Theorem 1. For any $\delta$, $0 < \delta < 1$, and for pseudo-dimension $d_\gamma$ defined by

51.5 

there is a value $n_\delta$ (depending on $\delta$, $c_1$, $c_2$, and $d_\gamma$), such that for any node in the
os-tree associated with at least $n_\delta$ points and satisfying the distribution assumptions
stated earlier, the probability that either (1) a random query point $q$ visits both
children in the os-tree search or (2) the search returns an incorrect result from
this node is at most $\delta$.

4 Experimental Results 

We used both synthetic and real data sets. Because the os-trees require a rela- 
tively large number of training points, one advantage of synthetic data sets is 
that we can generate training sets of arbitrary size. To emulate real data sets, 
we choose a distribution that is clustered in subspaces having low intrinsic
dimensionality, which we call clustered rotated flats. In this distribution, points
are distributed among clusters, whose centers are sampled uniformly from the
hypercube $[-1,1]^d$. Each cluster contains a low-dimensional flat. We call each
dimension on the flat a fat dimension and each of the others a thin dimension. The
fat dimensions are randomly chosen among the coordinates of the full space.
Points are distributed evenly among the clusters. In fat dimensions, points are drawn
from a uniform distribution in the range $[-1,1]$. In thin dimensions, we use
a Gaussian distribution with a standard deviation of $\sigma_{thin}$. Each flat is then
rotated independently. This is done by repeatedly applying a rotation matrix
$A$ to all points in the cluster. $A$ is an identity matrix except for four elements
$A_{ii} = A_{jj} = \cos\theta$, $A_{ij} = -A_{ji} = \sin\theta$, where $i$ and $j$ are randomly
chosen axes and $\theta$ is randomly chosen from $-\pi/2$ to $\pi/2$. In the experiments, the
number of clusters is fixed at 5, and the number of fat dimensions is $\lfloor 3d/10 \rfloor$.
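
The generation procedure described above can be sketched in a few lines of Python. This is only an illustrative reading of the description, not the authors' generator: the function name, the seed handling, and the choice to rotate each flat about its own origin before translating it to the cluster center are assumptions.

```python
import math
import random

def clustered_rotated_flats(n, d, n_clusters=5, sigma_thin=0.05,
                            n_rotations=None, seed=0):
    """Sketch of the 'clustered rotated flats' distribution (assumed details)."""
    rng = random.Random(seed)
    n_fat = (3 * d) // 10                  # floor(3d/10) fat dimensions
    if n_rotations is None:
        n_rotations = d
    points = []
    for c in range(n_clusters):
        center = [rng.uniform(-1, 1) for _ in range(d)]
        fat = set(rng.sample(range(d), n_fat))
        # Distribute the points evenly among the clusters.
        size = n // n_clusters + (1 if c < n % n_clusters else 0)
        cluster = [[rng.uniform(-1, 1) if k in fat else rng.gauss(0, sigma_thin)
                    for k in range(d)] for _ in range(size)]
        # Repeatedly apply a rotation in a randomly chosen coordinate plane (i, j).
        for _ in range(n_rotations):
            i, j = rng.sample(range(d), 2)
            theta = rng.uniform(-math.pi / 2, math.pi / 2)
            co, si = math.cos(theta), math.sin(theta)
            for p in cluster:
                pi_, pj = p[i], p[j]
                p[i] = co * pi_ + si * pj      # A_ii = cos, A_ij = sin
                p[j] = -si * pi_ + co * pj     # A_ji = -sin, A_jj = cos
        # Assumption: the flat is rotated about its own origin, then translated.
        points.extend([x + m for x, m in zip(p, center)] for p in cluster)
    return points
```

Calling `clustered_rotated_flats(32000, 30, sigma_thin=0.05, n_rotations=30)` would correspond to the parameters quoted for the synthetic data set in Section 4.1.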

Two real data sets are used in the experiments. The first set is NASA MODIS 
satellite images. MODIS is a sensor aboard a satellite to acquire spatial data in 
36 spectral bands of which 26 were usable. The data from the sensor are archived 
into files according to region of the earth and particular time. Both data and 
query sets contained around 13K points. The second set is the NOAA World Ocean
Atlas. This data set contains information about some basic attributes of the
oceans of the world, such as temperature and salinity.
We use 8 attributes in the data set. The data type of all attributes is floating
point. Both data and query sets contained around 11K points.

4.1 Eigenvalues of Point Sets 

In our analysis, we assumed that if we sort the eigenvalues of the data set 
associated with each node in decreasing order, these values decrease rapidly. 
This assumption is generally valid for most of the synthetic data sets. It is also 
true in both real data sets we use. We recorded all eigenvalues of the point set in 
each node in os-trees during tree construction. These eigenvalues are sorted and 
normalized. We computed these relative eigenvalues averaged over the nodes at 
each level of the os-tree. 

In Fig. 3 we show the first, second, third, and fifth eigenvalues and every fifth
eigenvalue after that. Note that the plot is log-scale along the y-axis. Level 0
represents the root node, level 1 represents the children of the root node, and
so on. Only the first five eigenvalues are consistently greater than 0.1 in several
levels in the tree. The rest decrease rapidly. Some of the lower eigenvalues are
left unplotted at near-leaf levels because they are zero. This is because there are
not enough points in such nodes to span all the dimensions. The results of the
NOAA ocean data set are quite similar: relatively few (three) of the
8 eigenvalues are greater than 0.1.

The results of a synthetic data set are shown in Fig. 4. We generated a data
set with 32K points in 30-dimensional space and set $\sigma_{thin} = 0.05$. The number of
rotations is 30. The clusters in this data set are 9-dimensional flats. The results
show that there is a significant gap between the 9th and 10th eigenvalues. The first
nine eigenvalues capture the major characteristics of the data set. Observe that
in the first few levels of the tree each node spans multiple clusters, and hence
the rapid reduction in eigenvalues is not as evident as it is in later levels, where
clusters are better isolated.

Fig. 3. Average eigenvalues of point sets in each level in the tree. NASA satellite data set.



Fig. 4. Average eigenvalues of point sets in each level in the tree. Synthetic data set
(clustered rotated flats, 32K points, 30 dimensions).



4.2 Number of Points in the Overlap Region 

In the second set of experiments, we investigated the fraction of data points of
each node that fall in the overlap region between the boundary hyperplanes.
These are points for which the search may visit both children, and hence this
fraction is related to the value of $\delta$ described in Section 3.
Fig. 5 shows the fraction of points that are in the overlap region for our real
data sets. Again, the plot shows the average value for nodes at the same level, where
level 0 is the root. The size of the overlap region depends on the value of the desired
failure probability. To achieve a small failure probability, a large training set is
needed. Additional points in the training set may widen the overlap region.
Consequently, the overlap region may contain more data points. The fraction usually
falls between 0.3 and 0.7 in various levels in the tree. Note that near the leaf
level this fraction drops significantly. This is because there are very few points in
the node relative to the dimension of the space. Therefore the Voronoi diagram
can be approximated much better by a single hyperplane. Similar behavior is
also observed for the NASA satellite data and the synthetic data set that we tested.



Fig. 5. Fraction of points in overlap region. NOAA ocean data set.



4.3 Comparison with kd-Tree Search 

We compared the search performance of the os-tree against an approximate
version of the well-known kd-tree data structure. The difficulty in comparing
these data structures is that the search models are different: probably-correct
search in the os-tree and approximate nearest neighbor search in the kd-tree. To
produce a realistic comparison, we adjusted the approximation factor ($\epsilon$) of the
kd-tree so that the resulting failure probability of the kd-tree matches that of
the os-tree.

We show the results for the synthetic data sets only. The comparative results
for the real data sets as well as other synthetic data sets are presented elsewhere.
We used different parameters from the other experiments. The number of points
is varied from 2K to 32K, the number of rotations is equal to the dimension, and
$\sigma_{thin} = 0.01$.
Fig. 6 compares the performance of both trees as we vary the number of
dimensions and the number of points. The average query time is used as the
metric. From the figure, we see that the os-tree is competitive with the kd-tree
except for high-dimensional instances. If the number of leaf nodes visited is used
as the measure of search performance, the os-tree search visits fewer nodes than
the kd-tree in almost all cases. The differences in CPU times are
largely due to the additional overhead suffered by the os-tree. Through the use
of incremental distance calculation, the processing time at each node of the
kd-tree is independent of dimension, while it is $O(d)$ for the os-tree.



Fig. 6. Performance comparison on clustered rotated flats ($d$ rotations), using query
time (sec) as the metric.
Abstract. We present I/O-efficient algorithms to construct planar Steiner
spanners for point sets and sets of polygonal obstacles in the plane,
and for constructing the “dumbbell” spanner for point sets in higher
dimensions. As important ingredients to our algorithms, we present
I/O-efficient algorithms to color the vertices of a graph of bounded degree,
answer binary search queries on topology buffer trees, and preprocess a
rooted tree for answering prioritized ancestor queries.

1 Introduction 

Motivation: Geometric spanners are sparse subgraphs of the complete
Euclidean graph over a set of points in $\mathbb{R}^d$. They play a key role in efficient
algorithmic solutions for several fundamental geometric problems. Several efficient
algorithms for constructing spanners in Euclidean space are known, including
I/O-efficient algorithms, thereby enabling the processing of much bigger
data sets that do not fit into internal memory. With respect to geometric shortest
path problems, in internal memory, spanners are useful because they are sparse,
so that approximate shortest path queries on the complete Euclidean graph,
whose size is $O(N^2)$, can be answered by solving the single-source shortest path
(SSSP) problem on a graph of size $O(N)$. In external memory, sparseness is
not sufficient to obtain I/O-efficient algorithms, as the best known single-source
shortest path algorithm takes $O(|V| + (|E|/B) \log_2 |E|)$ I/Os. The focus of
this paper is to construct spanners in such a way that spanner paths can be
reported I/O-efficiently.

* Research supported by NSERC and NCE GEOIDE.

Computational Model and Previous Results: In the Parallel Disk Model
(PDM), an external memory (EM) consisting of $D$ disks is attached
to a machine with an internal memory of size $M$. Each of the $D$ disks is
divided into blocks of $B$ consecutive data items. Up to $D$ blocks, at most one per
disk, can be transferred between internal and external memory in a single I/O
operation. The complexity of an algorithm in this model is the number of I/O
operations it performs. It has been shown that sorting an array of size $N$ takes
$\mathrm{sort}(N) = \Theta((N/(DB)) \log_{M/B}(N/B))$ I/Os in the PDM. Scanning an
array of size $N$ takes $\mathrm{scan}(N) = \Theta(N/(DB))$ I/Os. Due to the lack of space, we
are forced to omit a discussion of previous related work, except for the most
relevant results. However, we refer the reader to the literature on geometric spanners,
to [5] for the well-separated pair decomposition (WSPD), and to the literature on
external memory models and algorithms.
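
To get a feel for the two cost measures, the following sketch evaluates the leading terms of scan(N) and sort(N) for concrete parameters. It assumes the standard PDM forms, N/(DB) for scanning and (N/(DB)) log_{M/B}(N/B) for sorting; the function names are hypothetical, and the constants hidden by the asymptotic notation are ignored.

```python
import math

def scan_ios(n, b, d=1):
    """Leading term of scan(N): one pass over N items in blocks of B on D disks."""
    return math.ceil(n / (d * b))

def sort_ios(n, m, b, d=1):
    """Leading term of sort(N): scan cost times the number of merge passes.

    Assumes M > B so that the logarithm base M/B exceeds 1.
    """
    passes = max(1, math.ceil(math.log(n / b, m / b)))
    return scan_ios(n, b, d) * passes

# Example: 10^6 items, memory of 10^5 items, blocks of 10^3 items, one disk.
print(scan_ios(10**6, 10**3))            # 1000 block transfers
print(sort_ios(10**6, 10**5, 10**3))     # 2 passes over the data
```

Under these assumed parameters sorting costs only a small constant number of scans, which is why bounds of the form O(sort(N)) are considered efficient in this model.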

An algorithm to compute the WSPD and the corresponding spanner
of a point set in $\mathbb{R}^d$ in $O(\mathrm{sort}(N))$ I/Os using $O(N/B)$ blocks of external
memory has been proposed. By carefully choosing spanner edges, the diameter of
the spanner graph is shown to be at most $2 \log_2 N$. Reporting a spanner path
in this spanner takes $O(1)$ I/Os per edge in the path. Moreover, a lower bound
of $\Omega(\min\{N, \mathrm{sort}(N)\})$ I/Os for computing a $t$-spanner of a given point set is
presented.

New Results: In Sec. 3 we present an algorithm to construct the dumbbell
spanner for a set of $N$ points in $\mathbb{R}^d$ in $O(\mathrm{sort}(N))$ I/Os, show how to
compute an augmented spanner of size $O(N)$ and spanner diameter $O(\log_2 N)$
(resp. $O(\alpha(N))$), and show how to report spanner paths in these two spanners in
$O(\log_2 N/(DB))$ (resp. $O(\alpha(N))$) I/Os. These spanners are induced by a
constant number of rooted trees, called dumbbell trees, so that reporting a spanner
path reduces to reporting a path in one of these trees. The latter can be done
I/O-efficiently. To construct dumbbell spanners, we have to solve several
interesting subproblems, including vertex coloring of graphs of bounded degree,
answering prioritized ancestor queries, and answering queries on topology buffer
trees (see Sec. 2). In Sec. 4 we present an external-memory version of a known
algorithm to construct in $O(\mathrm{sort}(N))$ I/Os a planar Steiner spanner of size $O(N)$ and
spanning ratio $1 + \epsilon$ for a given set of $N$ points, or polygonal obstacles with $N$
vertices, in the plane. Planarity is desirable, as planar graphs can be blocked
and an I/O-efficient single-source shortest path algorithm for embedded planar
graphs is known. Also, planar graphs can be preprocessed for fast shortest
path queries.

Preliminaries: A Euclidean graph $G = (V, E)$ is a $t$-Steiner spanner for the
complete Euclidean graph $\mathcal{E}(S)$ defined on a set $S$ of points in $\mathbb{R}^d$ if $S \subseteq V$ and
for every pair of vertices $p, q \in S$, $\mathrm{dist}_G(p, q) \le t \cdot \mathrm{dist}_2(p, q)$. The vertices in $V \setminus S$
are called Steiner points. $G$ is a $t$-Steiner spanner for the visibility graph $\mathcal{V}(P)$
defined on a set $P$ of polygonal obstacles with vertex set $S$, if $S \subseteq V$ and no
edge in $G$ crosses the interior of any obstacle in $P$, and for every pair of vertices
$p, q \in S$, $\mathrm{dist}_G(p, q) \le t \cdot \mathrm{dist}_{\mathcal{V}(P)}(p, q)$. We call graph $G$ a $t$-spanner if $V = S$.

Given an axes-parallel box $R$ and a point set $S$ contained in $R$, a fair split
of $R$ is a partition of $R$ into two boxes $R_1$ and $R_2$, each containing at least one
point in $S$, using an axes-parallel hyperplane $H$; the distance of $H$ from the two
sides of $R$ parallel to $H$ has to be at least $\ell/3$, where $\ell$ is the length of the
shortest side of the bounding box of $S$. Given a point set $S$, let $R(S)$ be its bounding
box, and $\hat{R}(S)$ be the smallest axes-parallel hypercube containing $S$ and centered at
the center of $R(S)$. The following recursive procedure defines a fair split tree $T(S)$ for $S$: If
$|S| = 1$, then $T(S)$ consists of the single node $S$. Otherwise, we split $\hat{R}(S)$ into
two non-empty boxes $R'(S_1)$ and $R'(S_2)$, using a fair split. $\hat{R}(S_i)$ is defined as an
axes-parallel box of aspect ratio at most 3 which is contained in $R'(S_i)$, contains
$S_i$, can be partitioned using a fair split, and such that every side $e$ of $\hat{R}(S_i)$
either coincides with the corresponding side $e'$ of $R'(S_i)$ or is at distance at least
$\ell'/3$ from $e'$, where $\ell'$ is the length of the side of $R'(S_i)$ perpendicular to $e'$. Now
recursively compute trees $T(S_1)$ and $T(S_2)$ and make their roots children of $S$.
Note that every node in $T(S)$ corresponds to a unique subset of $S$. We identify
this subset with the node.

A well-separated pair decomposition (WSPD) of $S$ consists of a fair split tree
$T(S)$ and a set of pairs $\{\{A_1, B_1\}, \{A_2, B_2\}, \ldots, \{A_m, B_m\}\}$ such that $A_i, B_i$
are nodes in $T(S)$, for $1 \le i \le m$; for every pair of points $p, q \in S$, there is
a unique pair $\{A_i, B_i\}$ such that $p \in A_i$ and $q \in B_i$; and for all $1 \le i \le m$,
$\mathrm{dist}_2(p, q) \ge s \cdot \mathrm{dist}_2(x, y)$, for all $p \in A_i$, $q \in B_i$, and either $x, y \in A_i$ or
$x, y \in B_i$. The real number $s > 0$ is called the separation constant.
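
The separation condition in this definition can be checked directly for two small point sets. The brute-force sketch below is only meant to make the inequality concrete; the function names are hypothetical and the quadratic running time is of course not how a WSPD is computed.

```python
import math
from itertools import combinations

def diameter(points):
    """Largest pairwise Euclidean distance within one point set (0 for a singleton)."""
    return max((math.dist(x, y) for x, y in combinations(points, 2)), default=0.0)

def is_well_separated(a, b, s):
    """True if dist(p, q) >= s * dist(x, y) for all p in a, q in b,
    and all x, y taken from the same set."""
    min_between = min(math.dist(p, q) for p in a for q in b)
    return min_between >= s * max(diameter(a), diameter(b))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(10.0, 0.0), (11.0, 0.0)]
print(is_well_separated(A, B, s=4))    # True: gap 9 >= 4 * 1
print(is_well_separated(A, B, s=10))   # False: gap 9 < 10 * 1
```

Checking the minimum cross distance against the larger of the two diameters is equivalent to the quantified condition, since the diameter is the largest value dist(x, y) can take within a set.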

2 Techniques 

Coloring Graphs of Bounded Degree: Given a graph $G$ with $N$ vertices
and maximal degree bounded by some constant $\Delta$, the following algorithm colors
the vertices of $G$ with $\Delta + 1$ colors so that any two adjacent vertices in $G$
have different colors: Number the vertices of $G$ in their order of appearance in
the given vertex list, and direct the edges of $G$ from the vertices with smaller
numbers to those with larger numbers. We now process the vertices by increasing
numbers, one at a time. When processing a vertex we color it with the smallest
color different from the colors of its in-neighbors. This technique can easily be
realized in $O(\mathrm{sort}(N))$ I/Os using the time-forward processing technique [10, 12].
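
The coloring rule itself is easy to state in code. The following is a plain in-memory sketch of the greedy procedure, assuming vertices are numbered 0 to N-1 in list order and the graph has no self-loops; it does not model time-forward processing or I/O costs, and the function name is hypothetical.

```python
def greedy_coloring(n, edges):
    """Color vertices 0..n-1 so that adjacent vertices differ, using at most
    (max degree + 1) colors. Edges are directed from smaller to larger number."""
    in_neighbors = [[] for _ in range(n)]
    for u, v in edges:
        lo, hi = min(u, v), max(u, v)
        in_neighbors[hi].append(lo)
    color = [None] * n
    for v in range(n):                      # process by increasing number
        used = {color[u] for u in in_neighbors[v]}
        c = 0
        while c in used:                    # smallest color not used by in-neighbors
            c += 1
        color[v] = c
    return color

# A 5-cycle has maximum degree 2, so at most 3 colors are needed.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(greedy_coloring(5, cycle))            # [0, 1, 0, 1, 2]
```

A vertex has at most Δ in-neighbors, so among the colors 0, ..., Δ at least one is always free, which gives the Δ + 1 bound.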

Prioritized Ancestor Queries: Given a rooted tree $T$ and an assignment of
priorities $\mathrm{priority}(v) \in \{0, 1, \ldots, k\}$ to the vertices of $T$, we want to build a data
structure $\mathcal{D}$ that allows answering queries of the following type in $O(1)$ I/Os:
Given a vertex $v$ and an ancestor $u$ of $v$ in $T$, find the highest vertex priority $h$ on
the path $\pi$ from $v$ to $u$, and report the first vertex $\mathrm{first}(v, u)$ and the last vertex
$\mathrm{last}(v, u)$ on $\pi$ with priority $h$. We call these queries prioritized ancestor queries.
We show how to find $\mathrm{first}(v, u)$. A slight modification of this procedure finds
$\mathrm{last}(v, u)$. We augment $T$ with an artificial root $r$ with $\mathrm{priority}(r) = k + 1$, and
make $r$ the parent of the original root of $T$. Let $W_i = \{v \in T \mid \mathrm{priority}(v) = i\}$.
Then every node $v \in W_i$, $i \le k$, has an ancestor of higher priority. For every node
$v$, let $p'(v)$ be the lowest ancestor $u$ of $v$ in $T$ such that $\mathrm{priority}(u) > \mathrm{priority}(v)$.
Then we define a tree $T'$ by making $v$ the child of $p'(v)$.

Lemma 1. Given a node $v \in T$ and an ancestor $u$ of $v$ in $T$, $\mathrm{first}(v, u) = u$ or
$\mathrm{first}(v, u)$ is a child of the lowest common ancestor of $u$ and $v$ in $T'$, denoted by
$\mathrm{LCA}_{T'}(v, u)$.
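
The relationship stated in Lemma 1 can be checked on small examples with a direct in-memory sketch. The code below builds the parent pointers of T' and answers a query both by walking the path and by following the lemma. It assumes that "first" means the vertex of highest priority that is closest to v, represents the artificial root by -1, and uses linear-time walks instead of a constant-I/O LCA structure; all function names are hypothetical.

```python
ROOT = -1  # artificial root r, whose priority exceeds every real priority

def build_t_prime(parent, priority):
    """p_prime[v] = lowest ancestor of v in T with strictly higher priority."""
    p_prime = []
    for v in range(len(parent)):
        u = parent[v]
        while u != ROOT and priority[u] <= priority[v]:
            u = parent[u]
        p_prime.append(u)
    return p_prime

def first_brute_force(parent, priority, v, u):
    """Walk from v up to its ancestor u; return the first vertex of maximum priority."""
    path = [v]
    while path[-1] != u:
        path.append(parent[path[-1]])
    h = max(priority[x] for x in path)
    return next(x for x in path if priority[x] == h)

def chain(p_prime, v):
    """Ancestors of v in T', starting at v and ending at the artificial root."""
    out = [v]
    while out[-1] != ROOT:
        out.append(p_prime[out[-1]])
    return out

def first_via_lemma(p_prime, v, u):
    """Either u itself, or the child of LCA_{T'}(v, u) on v's chain."""
    chain_u = set(chain(p_prime, u))
    chain_v = chain(p_prime, v)
    for below, x in zip([None] + chain_v, chain_v):
        if x in chain_u:                 # x is the lowest common ancestor in T'
            return u if x == u else below
```

For the tree with `parent = [-1, 0, 1, 2, 2, 1]` and `priority = [1, 0, 2, 0, 1, 0]`, both functions return vertex 2 for the query (v, u) = (3, 0).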

We compute a binary tree $T''$ with $|T''| = O(|T'|)$ from $T'$ as follows: Replace
every node $v$ with children $w_1, \ldots, w_t$ by a path $v_1, \ldots, v_t$ of new nodes, such
that $v_{i+1}$ is the right child of $v_i$, for $1 \le i < t$. We call $v_1$ the representative



290 



A. Maheshwari, M. Smid, and N. Zeh 



$\mathrm{rep}(v)$ of $v$ and $v_t$ its anchor $\mathrm{anchor}(v)$. Let the children $w_1, \ldots, w_t$ be sorted by
decreasing depth in $T$. Then we make $\mathrm{rep}(w_i)$ the left child of $v_i$, for $1 \le i \le t$,
and give node $v_i$ a label $\mathrm{left}(v_i) = w_i$. The following result follows from Lemma 1
and gives an $O(1)$ I/O procedure to report $\mathrm{first}(v, u)$.

Lemma 2. Given a node $v$ and an ancestor $u$ of $v$ in $T$, $\mathrm{first}(v, u) \in \{u, z\}$,
where $z = \mathrm{left}(\mathrm{LCA}_{T''}(\mathrm{anchor}(u), \mathrm{anchor}(v)))$.

We compute the parents of the nodes in $T'$ for every set $W_i$ separately. Let
$T_0 = T$. Then we mark every vertex $w \in T_0$ with $\mathrm{priority}(w) > 0$. For every node,
we compute its lowest marked ancestor in $T_0$. This produces all parents $p'(v)$,
$v \in W_0$. We remove all vertices in $W_0$ from $T_0$ and make every vertex $w \notin W_0$
the child of its lowest marked ancestor in $T_0$. Let $T_1$ denote the resulting tree. We
now recursively apply this procedure to $T_1$ to obtain all parents $p'(v)$, for $v \notin W_0$.
Using time-forward processing, each recursive step takes $O(\mathrm{sort}(|T_i|))$ I/Os. Once
$T'$ has been computed, it takes $O(\mathrm{sort}(|T'|))$ I/Os to compute $T''$ from $T'$ and
to preprocess $T''$ for answering LCA-queries in $O(1)$ I/Os. If $|W_i| \le c|W_{i-1}|$, for
some constant $0 < c < 1$ and $1 \le i \le k$, we obtain the following result.

Theorem 1. Given a rooted tree $T$ with vertex priorities $\mathrm{priority}(v) \in
\{0, 1, \ldots, k\}$, let $W_i = \{v \in T : \mathrm{priority}(v) = i\}$, $0 \le i \le k$. If $|W_i| \le c|W_{i-1}|$,
for some constant $0 < c < 1$ and $1 \le i \le k$, it takes $O(\mathrm{sort}(N))$ I/Os and
$O(N/B)$ blocks of external memory to construct a data structure $\mathcal{D}$ that allows
answering prioritized ancestor queries on $T$ in $O(1)$ I/Os.

Querying Topology Buffer Trees: Given a binary tree $T$ whose nodes store
$O(1)$ information each, we call a binary search query $q$ strongly local on $T$ if the
information stored at a node $w$ and an ancestor $v$ of $w$ is sufficient to decide
whether all, none, or some of the answers to $q$ in $T(v)$ are stored in $T(w)$, and
we are required to report all answers to $q$ stored in $T$. We call $q$ weakly local
on $T$ if the information stored at a node $v$ is sufficient to decide whether $T(v)$
contains an answer to $q$, and we have to report one answer to $q$.

A topology tree $\mathcal{T}$ is a balanced representation of a possibly unbalanced
binary tree $T$. $\mathcal{T}$ has height $O(\log_2 N)$, where $N$ is the number of nodes in $T$, and
allows answering weakly local binary search queries on $T$ in $O(\log_2 N)$ time. To
construct $\mathcal{T}$, one starts with a tree $T_0 = T$, and recursively constructs a sequence
$T_0, T_1, \ldots, T_k$ of binary trees, where $T_{i+1}$ is obtained from $T_i$ by contracting a
carefully chosen set of edges in $T_i$. The vertex set of $\mathcal{T}$ is the disjoint union of
the vertex sets of trees $T_0, T_1, \ldots, T_k$. A vertex $v$ in $T_{i+1}$ is the parent of a vertex
$w$ in $T_i$ if $v$ is the result of contracting an edge $\{u, w\}$, or $v$ is a copy of $w$ in
$T_{i+1}$ and no edge $\{u, w\}$ in $T_i$ has been contracted.

If $T$ is static, we can extend the idea of [2] to obtain a topology buffer tree.
We construct a topology tree $\mathcal{T}$ for $T$ and cut it into layers of height $\log_2(M/B)$.
Each layer is a collection of rooted trees. We contract each such tree into a single
node. The resulting tree $\mathcal{B}$ is the topology buffer tree corresponding to $T$. $\mathcal{B}$ has
height $O(\log_{M/B} N)$; each node of $\mathcal{B}$ represents a subtree of $\mathcal{T}$ of size $O(M/B)$
and has at most $M/B$ children. Thus, every node of $\mathcal{B}$ fits into internal memory.
Combining the ideas of topology B-trees and buffer trees [2], we obtain the
following result.

Theorem 2. Given a topology buffer tree $\mathcal{B}$ representing an $N$-node binary tree
$T$ and $O(N)$ (weakly or strongly) local binary search queries, it takes $O(\mathrm{sort}(N +
K))$ I/Os and $O((N + K)/B)$ blocks of external memory to answer all queries,
where K is the size of the output, provided that M > B^. 



3 Spanners of Low Diameter 

Given a point set $S$ in $\mathbb{R}^d$, we want to construct linear-size spanner graphs
of spanner diameters $O(\log_2 N)$ and $O(\alpha(N))$ that can be represented by data
structures to report spanner paths in $O(\log_2 N/(DB))$ and $O(\alpha(N))$ I/Os,
respectively.



3.1 Dumbbell Trees 

Given a WSPD $\mathcal{D}$ for $S$, let $T(S)$ be the fair split tree of $\mathcal{D}$, and $C = R(S)$. We
refer to the well-separated pairs of $\mathcal{D}$ as dumbbells (as they look like dumbbells if
we connect the two centers of the bounding boxes of each well-separated pair by
a straight line segment). We define the length of a dumbbell to be the length of
this line segment. Refer to the two bounding boxes as the heads of the dumbbell.
Also, refer to $C$ as a head (which does not belong to any dumbbell). We want to
partition the set of dumbbells into a constant number of groups such that the
lengths of two dumbbells in the same group differ by a factor of at most 2 or by
a factor of at least $1/\delta$, for some $\delta > 0$ to be defined later; the heads of two
dumbbells in the same group whose lengths differ by a factor of at most two are
required to have distance at least $c\ell$ from each other, where $c$ is a constant to be
specified later, and $\ell$ is the length of the shorter dumbbell. We call the former
the length grouping property; the latter the separation property.

For every such group $\mathcal{G}$ of dumbbells we define a dumbbell tree $T_\mathcal{G}$ as follows:
$T_\mathcal{G}$ contains one dumbbell node per dumbbell in $\mathcal{G}$, one head node per dumbbell
head of the dumbbells in $\mathcal{G}$, and one node per point in $S$. The points in $S$ are
the leaves of $T_\mathcal{G}$. The head node corresponding to the special head $C$ is the root
of $T_\mathcal{G}$. For every dumbbell $\{A, B\}$, heads $A$ and $B$ are the children of $\{A, B\}$.
The parent of dumbbell $\{A, B\}$ is the smallest head node containing one of its
heads. Thus, every node, except $C$, has a well-defined parent. Such a tree can be
computed in $O(\mathrm{sort}(N))$ I/Os per group by marking all nodes in the fair split
tree corresponding to dumbbell heads in the group, making every leaf of the fair
split tree a child of its lowest marked ancestor, and making every dumbbell node
the child of the lowest marked ancestor of one of its heads. Leaves and dumbbell
nodes without marked ancestors are children of the head node $C$.

In order to compute groups $\mathcal{G}$ with the above properties, we first compute
$O(1)$ groups having the length grouping property and then refine these groups to
ensure the separation property. The length grouping property can be guaranteed
by simulating a known algorithm in external memory, which takes $O(\mathrm{sort}(N))$
I/Os. In particular, we compute a number of groups $\mathcal{G}_{i,j}$, where $0 \le j < b$,
for some constant $b$, such that each group $\mathcal{G}_j = \bigcup_i \mathcal{G}_{i,j}$ has the length grouping
property and the dumbbells in each group $\mathcal{G}_{i,j}$ differ by a factor of at most two in
length. To guarantee the separation property, we partition each group $\mathcal{G}_{i,j}$ into
$O(1)$ subgroups $\mathcal{G}_{i,j,k}$ such that the dumbbells in each subgroup satisfy the
separation property. We merge groups $\mathcal{G}_{i,j,k}$ into $O(1)$ groups $\mathcal{G}_{j,k} = \bigcup_i \mathcal{G}_{i,j,k}$, each
having the length grouping and separation properties. Consider one particular
group $\mathcal{G}_{i,j}$, and let $\ell$ be the length of the shortest dumbbell in $\mathcal{G}_{i,j}$.

In order to compute groups $\mathcal{G}_{i,j,k}$, we need to modify the dumbbells in $\mathcal{D}$
slightly. Consider a dumbbell $D = \{A, B\}$, and let $D' = \{A', B\}$ be its parent in
the computation tree.¹ Then $A'$ has been split into two boxes $A_1$ and $A_2$, where
$A$ is contained in $A_1$. In the following, we will consider $\{A, B\}$ to be dumbbell
$\{A_1, B\}$. It follows from the properties of a fair split tree and its WSPD that
the shortest side of head Ai has length at least I' = _|_i 2 )^ ’ 

The core of our algorithm is the construction of a proximity graph $\mathcal{P}$ containing
one vertex per dumbbell in $\mathcal{G}_{i,j}$ and an edge between two vertices if the two
corresponding dumbbells are too close. We do this as follows: For every dumbbell
$\{A, B\} \in \mathcal{G}_{i,j}$, put a box $\beta$ of side length $(c + 8/s + 4)\ell$ around the center of head
$A$. Then every dumbbell $\{E, F\} \in \mathcal{G}_{i,j}$ that is too close to $\{A, B\}$ must have
both its heads within this box. Partition $\beta$ into $O(1)$ grid cells of side length
$\ell'/2$. Then head $E_1$ must contain at least one of the grid vertices because it has
side length at least $\ell'$. Thus, if $p$ is a grid point generated by dumbbell $\{A, B\}$,
and $\{E, F\}$ is a dumbbell whose enlarged head $E_1$ contains $p$, we add an edge
between the vertices corresponding to $\{A, B\}$ and $\{E, F\}$ to $\mathcal{P}$. Next we show
how to find all dumbbell heads $E_1$ containing a grid point $p$.

The set of dumbbell heads containing a point $p$ is stored along a path in
the fair split tree $T$. Only a constant number of them can be heads of dumbbells
in $\mathcal{G}_{i,j}$, as the minimal side length of a dumbbell head in $\mathcal{G}_{i,j}$ is at most $2\ell/s$, the
minimal side length of the parent of a dumbbell head in $\mathcal{G}_{i,j}$ is at least $\ell'$, and the
side lengths of the boxes along a root-to-leaf path in the fair split tree decrease
by a factor of 2/3 every $d$ steps. For every grid point $p$, we report all these heads
using strongly local binary search on $T$. The total number of heads reported for
all grid points and all dumbbells is $O(N)$. It takes sorting and scanning to find
the dumbbells in $\mathcal{G}_{i,j}$ having the reported heads and to add the corresponding
edges to $\mathcal{P}$. It follows from standard packing arguments that $\mathcal{P}$ has bounded
degree, so that we can compute a vertex coloring of $\mathcal{P}$ with a constant number
of colors. The resulting color classes are the desired subgroups $\mathcal{G}_{i,j,k}$ of $\mathcal{G}_{i,j}$.

Lemma 3. Given a point set $S$ in $\mathbb{R}^d$ and a WSPD $\mathcal{D}$ for $S$, it takes $O(\mathrm{sort}(N))$
I/Os and $O(N/B)$ blocks of external memory to partition the dumbbells of $\mathcal{D}$ into
$O(1)$ groups, each having the length grouping and separation properties. Each
group can be represented by a dumbbell tree. The construction of all dumbbell
trees takes $O(\mathrm{sort}(N))$ I/Os and $O(N/B)$ blocks of external memory.

¹ See [2] for the definition of computation trees. Intuitively, $A'$ is the parent of $A$ in
the fair split tree, $\{A', B\}$ is not well-separated, and $A'$ is larger than $B$.
3.2 Spanners of Logarithmic Diameter

Let $T_1, \ldots, T_r$ be the dumbbell trees constructed in the previous section. Then
we construct graphs $G_1, \ldots, G_r$, each having vertex set $S$, from those trees. We
merge all these graphs $G_1, \ldots, G_r$ into a spanner $G$.

We construct $G_i$ as follows: For every node $v \in T_i$, let $\omega(v) = |T_i(v)|$. We
choose a representative point $r(v)$, for every node $v \in T_i$. If $v$ is a leaf, then $r(v)$
is the point represented by $v$. Otherwise, let $w_1, \ldots, w_k$ be the children of $v$ in
$T_i$. Then $r(v) = r(w_j)$, where $\omega(w_j) = \max\{\omega(w_h) : 1 \le h \le k\}$. We add an
edge $\{r(v), r(w)\}$ to $G_i$, for every edge $\{v, w\}$ in $T_i$ with $r(v) \ne r(w)$.
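
The choice of representatives and the resulting edge set can be illustrated with a short in-memory sketch. It assumes the tree is given as a dictionary of child lists, that the subtree weight counts all nodes in the subtree, and that ties between equally heavy children are broken by list order; the names `children`, `point`, and `spanner_edges` are hypothetical.

```python
def spanner_edges(children, point, root):
    """Compute representatives r(v) and the edge set of G_i for one tree.

    children[v]: list of children of node v (missing or empty for leaves)
    point[v]:    the input point represented by leaf v
    """
    # Preorder traversal; its reverse visits children before their parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    size, rep, edges = {}, {}, set()
    for v in reversed(order):
        kids = children.get(v, [])
        if not kids:
            size[v], rep[v] = 1, point[v]
            continue
        size[v] = 1 + sum(size[w] for w in kids)
        heaviest = max(kids, key=lambda w: size[w])   # child with largest subtree
        rep[v] = rep[heaviest]
        for w in kids:
            if rep[w] != rep[v]:                      # skip degenerate edges
                edges.add(frozenset((rep[v], rep[w])))
    return rep, edges

children = {"root": ["a", "b"], "a": ["p", "q"], "b": ["x"]}
point = {"p": 0, "q": 1, "x": 2}
rep, edges = spanner_edges(children, point, "root")
print(rep["root"], sorted(tuple(sorted(e)) for e in edges))
```

Because a node inherits its representative from its heaviest child, every change of representative along a leaf-to-root path at least doubles the subtree weight, which is the reason the compressed tree discussed below has logarithmic height.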

For two points $p, q \in S$, let $\{A, B\}$ be the unique dumbbell such that $p \in A$
and $q \in B$. Let $T_i$ be the dumbbell tree containing the dumbbell node
corresponding to $\{A, B\}$, and let $v$ and $w$ be the two leaves of $T_i$ such that $p = r(v)$
and $q = r(w)$. There is a unique path $\pi = (v = v_0, v_1, \ldots, v_k = w)$ from $v$ to
$w$ in $T_i$. This path corresponds to a path $\hat\pi = (p = r(v_0), r(v_1), \ldots, r(v_k) = q)$
in $G_i$. It has been shown that $\hat\pi$ has length at most $t \cdot \mathrm{dist}_2(p, q)$ if we choose
$s = O(d/(t - 1))$, $\delta = 1/s$, and $c = 2/\delta$ in the construction of the dumbbell
trees. Thus, graph $G$ is a $t$-spanner. Moreover, once we know the tree $T_i$ such that
$G_i$ contains the spanner path $\hat\pi$ as constructed above, we can easily report $\hat\pi$ by
traversing the paths from $v$ and $w$ to their LCA in $T_i$; but $\pi$ may be much longer
than $\hat\pi$ because many nodes along $\pi$ may have the same representative. Observe,
however, that all nodes in $T_i$ with the same representative $r(v)$ form a path from
the leaf $\ell$ with $r(\ell) = r(v)$ to some ancestor of $\ell$ in $T_i$. We construct a tree $T_i'$ by
compressing all non-leaf nodes on such a path into a single node. It follows from
the choice of representatives in $T_i$ that tree $T_i'$ has height at most $\log_2 N + 1$,
so that we can report $\hat\pi$ in $O(\log_2 N/(DB))$ I/Os. Unfortunately, we do not
know which of the dumbbell trees contains the dumbbell node corresponding to
$\{A, B\}$. However, as there are only $O(1)$ dumbbell trees, we can afford to query
all dumbbell trees and report the shortest path found.

Theorem 3. It takes $O(\mathrm{sort}(N))$ I/Os and $O(N/B)$ blocks of external memory
to construct a $t$-spanner of spanner diameter $O(\log_2 N)$ and size $O(N)$ for a
given set $S$ of $N$ points in $\mathbb{R}^d$, along with a data structure using $O(N/B)$ blocks
of external memory that allows reporting a $t$-spanner path with $O(\log_2 N)$ edges
between any two query points in $O(\log_2 N/(DB))$ I/Os.



3.3 Spanners of Nearly Constant Diameter 

Next we present an I/O-efficient algorithm to reduce the spanner diameter of
all graphs $G_i$ to $O(\alpha(N))$ and to construct a data structure that allows spanner
paths with $O(\alpha(N))$ edges to be reported at a cost of $O(1)$ I/Os per edge. The
construction is based on a known technique. The idea is to augment every dumbbell tree $T_i'$
with additional edges between nodes and subsets of their ancestors, so that the
shortest monotone path from a node $v$ to one of its ancestors contains $O(\alpha(N))$
edges, where a path is monotone if its nodes appear in the same order along a
leaf-to-root path in $T_i'$. Let $T_i^\circ$ be the resulting graph and $G_i^\circ$ be the supergraph
of $G_i$ containing edges $\{r(v), r(w)\}$, $\{v, w\} \in T_i^\circ$. For a node $v$ and an ancestor
$u$ of $v$, let $\pi$ be the path from $v$ to $u$ in $T_i'$, and $\pi^\circ$ be the shortest monotone
path from $v$ to $u$ in $T_i^\circ$. Let $\hat\pi$ and $\hat\pi^\circ$ be the corresponding spanner paths in $G_i$
and $G_i^\circ$. Then $\hat\pi^\circ$ is no longer than $\hat\pi$, by the triangle inequality.

We need some definitions: Given a function $f : \mathbb{N}_0 \to \mathbb{N}_0$ such that $f(0) = 0$
and $f(x) < x$, for $x > 0$, we define $f^{(0)}(x) = x$ and $f^{(i)}(x) = f(f^{(i-1)}(x))$, for
$i > 0$. Then $f^*(x) = \min\{k \ge 0 : f^{(k)}(x) \le 1\}$. We define a series of functions $\phi_i$,
where $\phi_0(x) = \lfloor \sqrt{x} \rfloor$ and $\phi_i(x) = \phi^*_{i-1}(x)$, for $i > 0$. For a forest $F$ and a set $W$
of vertices in $F$, let $F \cap W$ be the forest obtained by contracting every maximal
subtree whose non-root nodes are not in $W$ to a single node. Let $F * W$ be the
set of edges containing edges between every vertex $w$ in $W$ and all its ancestors
and descendants $u$ in $F \setminus W$ so that the path from $u$ to $w$ does not contain a
vertex in $W \setminus \{w\}$. Denote the irreflexive transitive closure of a DAG $G$ by $G^+$.
Given two parameters $0 \le k < x$, let $V(x, k)$ be the set of vertices in $F$ at levels
$k, k + x, k + 2x, \ldots$ We call $V(x, 0), \ldots, V(x, x - 1)$ the $x$-strided levels of $F$.
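
Two of these definitions translate directly into code: the iteration count f*, and the x-strided levels of a forest. The sketch below is illustrative only. It takes the function f as a parameter rather than fixing a particular base function, requires f(x) < x for all x > 1 so that the loop terminates, and represents a forest simply by a dictionary mapping each vertex to its level; all names are hypothetical.

```python
def iterated_star(f, x):
    """f*(x): the number of applications of f needed to bring x down to at most 1."""
    k = 0
    while x > 1:
        x = f(x)
        k += 1
    return k

def strided_levels(level, x):
    """Return [V(x, 0), ..., V(x, x-1)], where V(x, k) holds the vertices
    at levels k, k + x, k + 2x, ..."""
    classes = [set() for _ in range(x)]
    for v, depth in level.items():
        classes[depth % x].add(v)
    return classes

def minimum_strided_level(level, x):
    """The x-strided level containing the fewest vertices."""
    return min(strided_levels(level, x), key=len)

half = lambda x: x // 2                      # an illustrative choice of f
print(iterated_star(half, 8))                # 3, since 8 -> 4 -> 2 -> 1
log2_floor = lambda x: iterated_star(half, x)
print(iterated_star(log2_floor, 16))         # 3, since 16 -> 4 -> 2 -> 1

level = {"a": 0, "b": 1, "c": 1, "d": 2, "e": 3}
print(sorted(minimum_strided_level(level, 2)))   # ['a', 'd']
```

Applying the star operator repeatedly produces functions that grow more and more slowly, which is how the hierarchy of functions above leads to bounds in terms of the inverse Ackermann function.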

The algorithm of [?] consists of two parts. The top-level procedure Shortcut
computes the $\beta$-strided level $W$ of forest $F$ which has minimum size,
outputs edges $F \star W$ to be added to $F$, and then recursively shortcuts
$F \cap W$, calling procedure RecShortcut with parameter
$\beta = \min\{k > 0 : \phi_k(h(F)) < 4\}$, where $h(F)$ is the maximum height
of any tree in $F$. Procedure RecShortcut computes two parameters
$x_1 = \phi_{\beta-1}(h(F))$ and $x_2 = 3$, and the minimum $x_1$-strided level
$V_1$ of $F$. If $\beta = 1$, it outputs the edges in
$(F \cap V_1)^+ \cup (F \star V_1)$ to be added to $F$ and recursively calls
RecShortcut($F \setminus V_1$, 1). If $\beta > 1$, let $F_1 = F \cap V_1$, $V_2$
be the minimum $x_2$-strided level of $F_1$, and $F_2 = F_1 \cap V_2$; then
RecShortcut returns the edges in $(F \star V_1) \cup F_1 \cup F_2$ to be added
to $F$ and recursively shortcuts $F \setminus V_1$ and $F_2$ by invoking
RecShortcut($F \setminus V_1$, $\beta$) and RecShortcut($F_2$, $\beta - 1$).
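
Both procedures repeatedly pick the strided level of minimum size. The sketch below illustrates only that selection step, not the full procedures. It assumes that the forest is stored as a child-to-parent dictionary with roots mapped to `None` and that levels are counted from the roots; both are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def node_levels(parent):
    """Level of every node of a forest given as child -> parent (roots map to None)."""
    levels = {}
    for v in parent:
        path = []
        # Walk upwards until a root or a node with a known level is reached.
        while v is not None and v not in levels:
            path.append(v)
            v = parent[v]
        base = -1 if v is None else levels[v]
        for node in reversed(path):
            base += 1
            levels[node] = base
    return levels


def minimum_strided_level(parent, x):
    """Return (k, V(x, k)) for the residue k in 0..x-1 whose strided level is smallest."""
    levels = node_levels(parent)
    counts = Counter(lvl % x for lvl in levels.values())
    k = min(range(x), key=lambda r: counts[r])
    return k, {v for v, lvl in levels.items() if lvl % x == k}
```

Since the $x$ strided levels partition the vertex set, the selected level contains at most a $1/x$ fraction of the vertices.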

It is shown in [?] that the graph $T^\circ$ produced by augmenting a rooted tree
$T$ with the output of Shortcut($T$) has size $O(|T|)$. The shortest monotone
path in $T^\circ$ from a node $v$ to any of its ancestors $u$ in $T$ contains
$O(\alpha(|T|))$ edges. A data structure to find the shortest monotone path in
$T^\circ$ between two query vertices $v$ and $u$ can be derived quite naturally
from the computation of algorithm Shortcut. In particular, denote the input
forest to Shortcut by $F = F_{2\beta+1}$, and consider the tree $\mathcal{R}$ of
recursive calls to RecShortcut triggered by Shortcut. Then for every $\beta$,
the recursive invocations with parameter $\beta$ form a set of paths in
$\mathcal{R}$. For each such path $p_i$, let $I_i$ be the topmost node in
$\mathcal{R}$. Then we define a forest $F_{2\beta}$ as the union of the input
forests to all such top-level invocations $I_i$. We define another forest
$F_{2\beta-1}$ as the union of forests $F_1$ in the description of RecShortcut
for invocations with parameter $\beta$. If $\beta = 1$, then forest
$F_{2\beta-1}$ does not exist. Thus, we obtain a sequence
$F_{2\beta+1}, \ldots, F_2$ of forests.

Given a forest $F$ which is the input to invocation $I$, let $J_1, \ldots, J_k$
be the descendants of $I$ in $\mathcal{R}$ that represent invocations of
RecShortcut with parameter $\beta$. That is, $I = J_1$ represents
RecShortcut($F$, $\beta$), $J_2$ represents RecShortcut($F \setminus V_1$,
$\beta$), and so on. Then every node $v$ of $F$ appears in the set $V_1$ for at
most one invocation $J_i$. We give $v$ priority $i$. For forests $F_{2\beta-1}$,
we give nodes in $F_1 \cap V_2$ priority 1. For forest $F_{2\beta+1}$, all nodes
in $W$ get priority 1. All nodes with no priority label are assumed to have
infinite priority.

Given this labeling, a shortest monotone path from a node $v = v_{2\beta+1}$ to
its ancestor $u = u_{2\beta+1}$ can be found as follows: Since
$F = F_{2\beta+1}$, we start by examining $F_{2\beta+1}$. Given two nodes
$v_\gamma$ and $u_\gamma$ whose shortest monotone path in
$F_2 \cup F_3 \cup \cdots \cup F_\gamma$ we want to find, we find the minimum
priority $i$ such that there is a vertex on the path from $v_\gamma$ to
$u_\gamma$ in $F_\gamma$ with priority $i$, and report the lowest and highest
such vertices $v_{\gamma-1}$ and $u_{\gamma-1}$ on this path. There have to be
edges $\{v_\gamma, v_{\gamma-1}\}$ and $\{u_{\gamma-1}, u_\gamma\}$ in
$T^\circ$. If $\gamma = 2$, there is also an edge
$\{v_{\gamma-1}, u_{\gamma-1}\}$ in $T^\circ$. Otherwise, we recursively find
the shortest monotone path from $v_{\gamma-1}$ to $u_{\gamma-1}$ in
$F_2 \cup F_3 \cup \cdots \cup F_{\gamma-1}$. If there is no vertex with finite
priority on the path from $v_\gamma$ to $u_\gamma$ in $F_\gamma$, this path must
be short. We report this path by traversing $F_\gamma$ and do not recurse.

Finding vertices $v_{\gamma-1}$ and $u_{\gamma-1}$ in forest $F_\gamma$ for two
query vertices $v_\gamma$ and $u_\gamma$ is a prioritized ancestor query; we
just report minimum priority ancestors instead of maximum priority ancestors.
Thus, given data structures $\mathcal{F}_{2\beta+1}, \ldots, \mathcal{F}_2$ to
answer prioritized ancestor queries on $F_{2\beta+1}, \ldots, F_2$, the above
shortest path procedure takes $O(1)$ I/Os per step along the shortest path from
$v$ to $u$ in $T^\circ$, $O(\alpha(N))$ I/Os in total. It remains to show that
these data structures can be built I/O-efficiently. The following lemma is
crucial for this.

Lemma 4. Given a forest $F_\gamma$, $2 \le \gamma \le 2\beta + 1$, and a
partitioning of the vertices of $F_\gamma$ into subsets $W_1, \ldots, W_k$ such
that $W_i$ contains all vertices in $F_\gamma$ having priority $i$,
$1 \le i \le k$, $|W_i| \le$ [?] for $1 < i \le k$.

Forests $F_{2\beta+1}, \ldots, F_2$ are easily obtained by simulating the
computation of procedures Shortcut and RecShortcut. By Theorem [?] and Lemma [?]
it takes $O(\mathrm{sort}(N))$ I/Os to compute data structures
$\mathcal{F}_{2\beta+1}, \ldots, \mathcal{F}_2$ from forests
$F_{2\beta+1}, \ldots, F_2$, as the total size of the forests
$F_{2\beta+1}, \ldots, F_2$ is $O(N)$ [?].

Theorem 4. It takes $O(\mathrm{sort}(N))$ I/Os and $O(N/B)$ blocks of external
memory to construct a $t$-spanner of spanner diameter $O(\alpha(N))$ and size
$O(N)$ for a given set $S$ of $N$ points, along with a data structure using
$O(N/B)$ blocks of external memory that allows reporting a $t$-spanner path with
$O(\alpha(N))$ edges between any two query points in $O(\alpha(N))$ I/Os.



4 Planar Steiner Spanners 

Given a set $P$ of simple polygonal obstacles with vertex set $S$ in the plane,
we want to construct a planar Steiner spanner $G$ of size $O(|S|)$ and spanning
ratio $1 + \epsilon$ for the visibility graph $V(P)$ of $P$. Our algorithm
follows the framework of [?]. It constructs a planar subdivision based on the
position of the vertices in $S$ and then combines this subdivision with the
subdivision defined by the obstacle edges to obtain an $L_1$-Steiner spanner. A
planar Euclidean Steiner spanner is computed by superimposing a constant number
of planar $L_1$-Steiner spanners.



4.1 Planar $L_1$-Steiner Spanners for Point Sets

We make frequent use of a procedure interval($s$, $r$) that partitions the
segment $s$ into subsegments of length $r$ each by adding Steiner vertices on
$s$. The following planar subdivision $D'$ of a minimal axes-parallel square $C$
containing all points of $S$ is the basis for our spanner construction. The
cells of $D'$ are of two types. Let a box be an axes-parallel rectangle of
aspect ratio at most 3. A box cell is a box and contains exactly one point of
$S$. A donut cell is the set-theoretic difference of two boxes $B$ and $B'$,
does not contain any point of $S$, and for every side $e$ of $B$, the distance
to the corresponding side $e'$ of $B'$ is either zero or at least $\|e'\|/6$.

Given such a subdivision $D'$, construct a planar $L_1$-Steiner spanner $D''$
for $S$ as follows: Perform interval($e$, $\gamma\ell$), for every edge $e$ of
$D'$, where $\ell$ is the length of the shortest edge of the box to which $e$
belongs, and $0 < \gamma < 1$ is an appropriately chosen constant to be defined
later. For every cell $R$ and every boundary edge $e$ of $R$, shoot rays
orthogonal to $e$ from the endpoints of $e$ and from the Steiner vertices on $e$
toward the interior of $R$ until they meet another edge. For every box cell $R$
containing a point $p \in S$, we also shoot rays from $p$ in all four
axes-parallel directions until they meet the boundary of $R$. To preserve the
planarity of the resulting graph, we introduce all intersection points between
such rays as Steiner vertices. The following lemma now follows from [?] and [?].
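
The subdivision step interval($s$, $r$) used in this construction can be sketched as follows. The sketch assumes a segment given by its two endpoints and lets the last subsegment be shorter when $r$ does not divide the segment length; the source does not say how a remainder is handled, so that choice and the function name are assumptions.

```python
import math


def interval(p, q, r):
    """Steiner vertices that cut segment pq into pieces of length r.

    The final piece may be shorter than r (an assumption of this sketch).
    """
    (px, py), (qx, qy) = p, q
    length = math.hypot(qx - px, qy - py)
    points = []
    i = 1
    while i * r < length:
        t = i * r / length  # fraction of the way from p to q
        points.append((px + t * (qx - px), py + t * (qy - py)))
        i += 1
    return points
```

With $r = \gamma\ell$ and boxes of aspect ratio at most 3, every box edge receives fewer than $3/\gamma$ Steiner vertices, which is where the dependence of the spanner size on $\gamma$ comes from.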

Lemma 5. Given a set $S$ of $N$ points in the plane and a linear size
subdivision $D'$ as above, it takes
$O(\mathrm{sort}(N) + \mathrm{scan}(N/\gamma^2))$ I/Os to construct a planar
$L_1$-Steiner spanner of size $O(N/\gamma^2)$ and with spanning ratio
$1 + 6\gamma$ for $S$.

Subdivision $D'$ is quite naturally derived from a fair split tree $T$ for $S$.
The rectangles $R(v)$ associated with the leaves of $T$ are the box cells of
$D'$. These box cells cover almost all the square $C$ containing all points in
$S$. The uncovered parts of $C$ can be covered by regions
$R'(v) \setminus R(v)$, where $R'(v)$ was shrunk to $R(v)$ before splitting
$R(v)$. We include these regions as the donut cells of $D'$. Using the fair
split tree construction of [?] and choosing $\gamma = \epsilon/6$, we obtain
the following result.

Theorem 5. Given a set $S$ of $N$ points in the plane and a constant
$\epsilon > 0$, it takes
$O(\mathrm{sort}(N) + \mathrm{scan}(N/\epsilon^2))$ I/Os to construct a planar
$L_1$-Steiner spanner of size $O(N/\epsilon^2)$ and with spanning ratio
$1 + \epsilon$.

4.2 Planar Steiner Spanners among Polygonal Obstacles 

First we construct a planar $L_1$-Steiner spanner for a given set $P$ of
polygonal obstacles with vertex set $S$ in the plane. We construct the
subdivision $D'$ w.r.t. set $S$ and combine it in an appropriate manner with the
graph defined by the obstacles in $P$ to obtain a subdivision $D_2$. The spanner
is then constructed from $D_2$ in a manner similar to the construction of $D''$.
Our algorithm to construct $D_2$ is based on [?]. However, we use only one
$(a, b)$-tree to represent the sweep line status instead of using two balanced
binary trees. This simplification is crucial to allow an I/O-efficient
implementation of this procedure.

Let $D_1$ be the superimposition of subdivision $D' = (S', E')$, viewed as a
graph, and the graph $D = (S, E)$ defined by the set of obstacles in $P$. That
is, the vertex set of $D_1$ contains all vertices in $S \cup S'$ and all
intersection points between edges in $E'$ and $E$. The edge set of $D_1$ is
obtained by splitting the edges in $E' \cup E$ at their intersection points with
other edges. $D_1$ may have size $\Omega(N^2)$. That is why we base the
construction of an $L_1$-spanner for $P$ on a linear size subgraph $D_2$ of
$D_1$, which we construct without constructing $D_1$ first.

We divide the regions of $D_1$ into two classes: A red region is a quadrilateral
none of whose vertices is in $S \cup S'$. The remaining regions are blue
regions. Let the red graph of $D_1$ be the subgraph of the dual of $D_1$
containing a vertex for every red region of $D_1$ and an edge between two
vertices if the two corresponding regions share an edge that is part of the
boundary of a box or donut cell. The connected components of the red graph are
paths. The red regions along such a path are bounded by the same two obstacle
edges and a set of edges in $E'$. We call such a set of red regions a ladder.
The two obstacle edges on their boundaries are the sides of the ladder; the
edges from $E'$ are its rungs. Call the topmost horizontal rung of a ladder its
top rung; we define left, right, and bottom rungs in a similar manner. All of
these four types of rungs are called extremal rungs. We call a ladder trivial if
it consists only of a single red region. Otherwise, it is non-trivial.
Subdivision $D_2$ is obtained from $D_1$ by replacing every ladder in $D_1$ by a
single region. It is shown in [?] that $D_2$ has size $O(N)$. Using arguments
from [?] and a construction similar to that of Sec. 4.1, we obtain the following
result.

Lemma 6. Given a subdivision $D_2$ as defined above, it takes
$O(\mathrm{sort}(N) + \mathrm{scan}(N/\gamma^2))$ I/Os to construct a planar
$L_1$-Steiner spanner of size $O(N/\gamma^2)$ and spanning ratio $1 + 6\gamma$
for a given set $P$ of polygonal obstacles with $N$ vertices.

In order to construct $D_2$, we use four plane sweeps to compute potential top,
bottom, left, and right rungs. The resulting subdivision $D_3$ may still contain
non-trivial ladders. But its size is $O(N)$, so that we can afford to construct
$D_3$ explicitly and remove all non-extremal rungs to obtain $D_2$. We describe
the construction of potential top rungs.

Let $E'_h$ be the set of horizontal edges in $E'$. We use a bottom-up sweep to
compute all potential top rungs. During the sweep, we maintain a set of
intervals defined by intersections between the sweep line $\ell$ and obstacle
edges. In particular, let $e_1, \ldots, e_k$ be the edges in $E$ intersected by
the sweep line from left to right. Then the intervals currently stored for
$\ell$ are $(e_1, e_2), (e_2, e_3), \ldots, (e_{k-1}, e_k)$. An interval
$I = (l, r)$ in this list is a ladder interval if there is a ladder between $l$
and $r$, and we have already found at least one horizontal rung of that ladder.
Otherwise, it is a non-ladder interval.

We start the sweep with a single non-ladder interval defined by the left and
right boundaries of the square $C$ containing the whole vertex set $S$. Event
points of the sweep are the $y$-coordinates of the horizontal edges in $E'$ and
the endpoints of obstacle edges. We perform the following updates, depending on
the type of event point. Let $I_1, \ldots, I_k$ be the current set of intervals
defined by the sweep line. When we encounter a horizontal edge $e$ of $E'$, we
find intervals $I_l$ and $I_r$ containing the left and right endpoints of $e$.
Intervals $I_{l+1}, \ldots, I_{r-1}$ now become ladder intervals with their
current top rungs set to $e$. Intervals $I_l$ and $I_r$ become non-ladder
intervals. If $I_l$ or $I_r$ was classified as a ladder interval before $\ell$
passed $e$, we output the top rung for $I_l$ or $I_r$, respectively. Similar
procedures are applied to update the interval list when the sweep line passes an
obstacle vertex.
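
The update for a horizontal edge can be illustrated with a simplified in-memory sketch. It assumes that the obstacle edges currently crossing the sweep line are represented only by their sorted $x$-positions, that both endpoints of the horizontal edge fall strictly inside intervals, and that an edge is a tuple `(x_left, x_right, y)`. These representations and the name `process_horizontal_edge` are hypothetical; events at obstacle vertices are not covered.

```python
from bisect import bisect_right


def process_horizontal_edge(boundaries, top_rung, edge):
    """Update the interval list when the sweep passes a horizontal edge.

    boundaries: sorted x-positions of obstacle edges on the sweep line;
                interval j lies between boundaries[j] and boundaries[j + 1].
    top_rung:   top_rung[j] is None for a non-ladder interval, otherwise the
                current top rung of ladder interval j (modified in place).
    Returns the top rungs that are output at this event.
    """
    x_left, x_right, _y = edge
    l = bisect_right(boundaries, x_left) - 1   # interval containing the left endpoint
    r = bisect_right(boundaries, x_right) - 1  # interval containing the right endpoint
    finished = []
    for j in sorted({l, r}):
        if top_rung[j] is not None:
            finished.append(top_rung[j])       # the ladder ends here: report its top rung
        top_rung[j] = None                     # end intervals become non-ladder intervals
    for j in range(l + 1, r):
        top_rung[j] = edge                     # intervals in between get top rung e
    return finished
```

A rung is reported only when its interval later contains an endpoint of a horizontal edge, which matches the rule that $I_l$ and $I_r$ output their top rungs if they were ladder intervals.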

We use a buffer tree [2] to represent the sweep line status. In particular, we
store the current set of intervals sorted from left to right in this tree. Every
node in the buffer tree stores a time-stamped tag classifying all intervals
stored in this subtree as ladder or non-ladder intervals and describing the top
rung in the case of a ladder interval. These tags are chosen so that for every
interval, at any time, the most recent tag along the path from the root of the
tree to the leaf storing the interval represents the type and top rung of the
interval correctly.

When the sweep passes a horizontal edge $e$, we search for the leaves $l_l$ and
$l_r$ of $T$ storing $I_l$ and $I_r$. Denote the paths from the LCA of $l_l$ and
$l_r$ in $T$ to $l_l$ and $l_r$ by $p_l$ and $p_r$. For all right siblings of
nodes on $p_l$, we store that their descendants store ladder intervals with top
rung $e$. We do the same for the left siblings of nodes on $p_r$. As this is the
most recent information added to $T$, all intervals between $I_l$ and $I_r$ are
now tagged as ladder intervals with top rung $e$. Intervals $I_l$ and $I_r$ are
tagged as non-ladder intervals. It is easy to find the most recent tags for
$I_l$ and $I_r$ on the way down $p_l$ and $p_r$, so that the top rungs for $I_l$
and $I_r$ are output if necessary. The procedures to update the tree when the
sweep line passes an endpoint of an obstacle edge are similar.
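
The time-stamped tagging idea can be tried out in memory with an ordinary binary tree over a fixed set of leaves. This is a swapped-in structure used only for illustration: the text uses a buffer tree, whose batched I/O behaviour is not modelled here. The class name and its methods are hypothetical.

```python
class TaggedIntervalTree:
    """Complete binary tree whose nodes carry (time stamp, tag) pairs.

    The valid tag of a leaf is the most recent tag on its leaf-to-root path.
    """

    def __init__(self, n):
        self.size = 1
        while self.size < n:
            self.size *= 2
        self.tags = [(-1, None)] * (2 * self.size)
        self.clock = 0

    def tag_range(self, lo, hi, tag):
        """Tag leaves lo..hi (inclusive) by writing to O(log n) subtree roots."""
        stamp = (self.clock, tag)
        self.clock += 1
        lo += self.size
        hi += self.size + 1
        while lo < hi:
            if lo & 1:
                self.tags[lo] = stamp
                lo += 1
            if hi & 1:
                hi -= 1
                self.tags[hi] = stamp
            lo //= 2
            hi //= 2

    def query(self, leaf):
        """Return the most recent tag on the path from the leaf to the root."""
        node = leaf + self.size
        best = (-1, None)
        while node >= 1:
            if self.tags[node][0] > best[0]:
                best = self.tags[node]
            node //= 2
        return best[1]
```

Tagging a whole range touches only the roots of the subtrees hanging off the two boundary paths, which mirrors the role of the right siblings on $p_l$ and the left siblings on $p_r$.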

Theorem 6. Given a set of polygonal obstacles in the plane with $N$ vertices
in total, a planar $L_1$-Steiner spanner of spanning ratio $1 + \epsilon$ and
size $O(N/\epsilon^2)$ can be computed in
$O(\mathrm{sort}(N) + \mathrm{scan}(N/\epsilon^2))$ I/Os using
$O(N/(\epsilon^2 B))$ blocks of external memory.

Combining the final step of the algorithm of [?] with the red-blue line
intersection algorithm of [?], we obtain the following corollary.

Corollary 1. Given a set of polygonal obstacles in the plane with N vertices in 
total, a planar Euclidean Steiner spanner of spanning ratio 1+e and size 0{N/e'^) 
can be computed in 0(sort(iV/e^) -|- scan(7V/e^)) I/Os using 0{N/{e'^B)) blocks 
of external memory. 
Abstract. We present a first exact study on higher-dimensional packing
problems with order constraints. Problems of this type occur naturally 
in applications such as logistics or computer architecture and can be in- 
terpreted as higher-dimensional generalizations of scheduling problems. 
Using graph-theoretic structures to describe feasible solutions, we de- 
velop a novel exact branch-and-bound algorithm. This extends previous 
work by Fekete and Schepers; a key tool is a new order-theoretic charac- 
terization of feasible extensions of a partial order to a given comparability 
graph that is tailor-made for use in a branch-and-bound environment. 
The usefulness of our approach is validated by computational results. 



1 Introduction 

Scheduling and Packing Problems. Scheduling is arguably one of the most im- 
portant topics in combinatorial optimization. Typically, we are dealing with a 
one-dimensional set of objects (“jobs”) that need to be assigned to a finite set of 
containers (“machines”). Problems of this type can also be interpreted as (one- 
dimensional) packing problems, and they are NP-hard in the strong sense, as 
problems like 3-Partition are special cases. 

Starting from this basic scenario, there are different generalizations that have 
been studied. Many scheduling problems have precedence constraints on the
sequence of jobs. On the other hand, a great deal of practical packing problems
consider higher-dimensional instances, where objects are axis-aligned boxes
instead of intervals. Higher-dimensional packing problems arise in many industries,
where steel, glass, wood, or textile materials are cut. The three-dimensional 
problem is important for practical applications such as container loading. 

In this paper, we give the first study of problems that comprise both general- 
izations: these are higher-dimensional packing problems with order constraints — 
or, from a slightly different point of view, higher-dimensional scheduling prob- 
lems. In higher-dimensional packing, these problems arise when dealing with 
precedence constraints that are present in many container-loading problems. 
Another practical motivation to consider multi-dimensional scheduling problems
arises from optimizing the reconfiguration of a particular type of computer
chip called an FPGA, described below.

F. Dehne, J.-R. Sack, and R. Tamassia (Eds.): WADS 2001, LNCS 2125, pp. 300 ff., 2001.
© Springer-Verlag Berlin Heidelberg 2001

Fig. 1. An FPGA and a set of five jobs, shown (as rectangles) in regular
two-dimensional $x,y$-space and (as boxes) in three-dimensional space-time
$x, y, t$. All jobs must be placed inside the chip and must not overlap if
executed simultaneously on the chip.

Fig. 2. Projecting the boxes of a feasible packing onto the coordinate axes
defines interval graphs (here in 2D: $G_x$ and $G_y$).

Field-Programmable Gate Arrays and Higher-Dimensional Scheduling. A
particularly interesting class of instances of three-dimensional orthogonal
packing arises from a new type of reconfigurable computer chips, called
field-programmable gate arrays (FPGAs).

An FPGA typically consists of a regular rectangular grid of equal configurable
cells (logic blocks) that allow the prototyping of simple logic functions
together with simple registers and with special routing resources (see Figure 1).
These chips (see e.g. [?]) may support several independent or interdependent
jobs and designs at a time, and parts of the chip can be reconfigured quickly
during run-time. (For more technical details on the underlying architecture, see
our previous paper [?], and the more recent abstract [?].) Thus, we are faced
with a general class of problems that can be seen as both scheduling and packing
problems; two dimensions correspond to the space of the chip area, while the
third dimension corresponds to time. In this paper, we develop a set of
mathematical tools to deal with these higher-dimensional scheduling problems. We
show that our methods are suitable for solving instances of interesting size to
optimality.

Related Work. It is easy to see that any higher-dimensional packing problem 
(possibly with precedence constraints on the temporal order) can be relaxed to 
a resource-constrained scheduling problem. However, there are examples with as 
few as eight jobs showing that the converse is not true, even for small instances 
of two-dimensional packing problems without any precedence constraints: An 
optimal solution for the corresponding resource-constrained scheduling problem 
may not give rise to a feasible arrangement of rectangles for the original packing 
problem. 

Higher-dimensional packing problems (without order constraints) have been
considered by a great number of authors, but only few of them have dealt
with the exact solution of general two-dimensional problems. See [?] for an
overview. It should be stressed that unlike one-dimensional packing problems,
higher-dimensional packing problems allow no straightforward formulation as 
integer programs: After placing one box in a container, the remaining feasible 
space will in general not be convex. Moreover, checking whether a given set of 
boxes fits into a particular container (the so-called orthogonal packing problem, 
OPP) is trivial in one-dimensional space, but NP-hard in higher dimensions. 

Nevertheless, attempts have been made to use standard approaches of
mathematical programming. Beasley [?] and Hadjiconstantinou and Christofides [?]
have used a discretization of the available positions to an underlying grid to
get a 0-1 program with a pseudopolynomial number of variables and constraints.
Not surprisingly, this approach becomes impractical beyond instances of rather
moderate size. More recently, Padberg [?] gave a mixed integer programming
formulation for three-dimensional packing problems, similar to the one
anticipated by Schepers [?] in his thesis. Padberg expressed the hope that using
a number of techniques from branch-and-cut will be useful; however, he did not
provide any practical results to support this hope.

In [?], a different approach to characterizing feasible packings and
constructing optimal solutions is described. See Figure 2 for a visual
description. A graph-theoretic characterization of the relative position of the
boxes in a feasible packing (by so-called packing classes) is used, which
represents $d$-dimensional packings by a tuple of $d$ interval graphs (called
component graphs). Any $d$-tuple of graphs $G_i = (V, E_i)$ arising in this
manner must satisfy the following conditions:

C1: $G_i$ is an interval graph, $\forall i \in \{1, \ldots, d\}$.

C2: Any independent set $S$ of $G_i$ is $i$-admissible,
$\forall i \in \{1, \ldots, d\}$, i.e.,
$w_i(S) = \sum_{v \in S} w_i(v) \le h_i$, because all boxes in $S$ must fit
into the container in the $i$th dimension.

C3: $\bigcap_{i=1}^{d} E_i = \emptyset$. In other words, there must be at least
one dimension in which the corresponding boxes do not overlap.

A $d$-tuple of component graphs satisfying these necessary conditions is called
a packing class. The remarkable property (proven in [?]) is that these three
conditions are also sufficient for the existence of a feasible packing.

Theorem 1

A $d$-tuple of graphs $G_i = (V, E_i)$ corresponds to a feasible packing, iff it
is a packing class, i.e., if it satisfies the conditions C1, C2, C3.
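
Conditions C2 and C3 can be checked directly on tiny instances. The brute-force sketch below assumes that C2 means the total size of an independent set in dimension $i$ must not exceed the container size $h_i$; it does not test C1 (interval graph recognition), and it enumerates all subsets, so it is only meant for a handful of boxes. All names are hypothetical.

```python
from itertools import combinations


def is_independent(nodes, edges):
    """True if no two of the given nodes are joined by an edge."""
    return all(frozenset(pair) not in edges for pair in combinations(nodes, 2))


def check_c2_c3(vertices, edge_sets, widths, container):
    """Return (C2 holds, C3 holds) for component graphs given by their edge sets.

    edge_sets[i]: set of frozenset({u, v}) edges of G_i
    widths[i][v]: size of box v in dimension i
    container[i]: container size h_i in dimension i
    """
    # C3: no pair of boxes may overlap in every dimension.
    c3 = not set.intersection(*map(set, edge_sets))
    # C2: every independent set of G_i must fit into the container in dimension i.
    c2 = True
    for i, edges in enumerate(edge_sets):
        for size in range(1, len(vertices) + 1):
            for subset in combinations(vertices, size):
                too_wide = sum(widths[i][v] for v in subset) > container[i]
                if too_wide and is_independent(subset, edges):
                    c2 = False
    return c2, c3
```

For two boxes of size 2 x 1 placed side by side, the $x$-graph has no edge and the $y$-graph has one edge; the pair fits a 4 x 1 container but violates C2 for a 3 x 1 container.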

This factors out a great deal of symmetries between different feasible
packings, it allows us to make use of a number of elegant graph-theoretic tools
(like the characterizations reported in [?]), and it reduces the geometric
problem to a purely combinatorial one without using brute-force methods like
introducing an underlying coordinate grid. Combined with good heuristics for
dismissing infeasible sets of boxes [?], a tree search for constructing feasible
packings was developed. This exact algorithm has been implemented; it
outperforms previous methods by a clear margin. See our previous papers for
details.

Graph Theory of Order Constraints. In the context of scheduling with precedence
constraints, a natural problem is the following, called transitive ordering with
precedence constraints (TOP): Consider a partial order $P = (V, \prec)$ of
precedence constraints and a (temporal) comparability graph $G = (V, E)$, such
that all relations in $P$ are represented by edges in $G$. Is there a transitive
orientation $D = (V, A)$ of $G$, such that $P$ is contained in $D$?
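
To make the problem statement tangible, TOP can be decided on very small graphs by trying every orientation. This is an exponential brute-force sketch for illustration only; it is not the linear-time algorithm discussed next, and the function names are hypothetical.

```python
from itertools import product


def is_transitive(arcs):
    """True if (u, v) and (v, w) in arcs always implies (u, w) in arcs."""
    return all((u, w) in arcs
               for (u, v) in arcs
               for (v2, w) in arcs
               if v == v2)


def solve_top(edges, precedence):
    """Return a transitive orientation of `edges` containing `precedence`, or None.

    edges:      list of pairs (u, v), one per undirected edge of G
    precedence: set of arcs (u, v) meaning u must precede v
    """
    edges = [tuple(e) for e in edges]
    for flips in product((False, True), repeat=len(edges)):
        arcs = {(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)}
        if precedence <= arcs and is_transitive(arcs):
            return arcs
    return None
```

On the path a-b-c-d, the constraints a before b and c before d admit the orientation a to b, c to b, c to d, whereas a before b together with d before c admits none, although every constraint is an edge of the graph.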

Korte and Möhring [?] have given a linear-time algorithm for deciding TOP.
However, their approach is only useful when the full set of edges in G is known. 
When running a branch-and-bound algorithm for solving a scheduling problem, 
these edges of G are only known partially, but they may already prohibit the 
existence of a feasible solution for a given partial order P. This makes it desirable 
to come up with structural characterizations that are already useful when only 
parts of G are known. 

Results of this paper. In this paper, we give the first exact study of higher- 
dimensional packing with order constraints, which can also be interpreted as 
higher-dimensional scheduling problems. We develop a general framework for
problems of this type by giving a pair of necessary and sufficient conditions 
for the existence of a solution for the problem TOP on graphs G in terms of 
forbidden substructures. Using the concept of packing classes described above, 
our conditions can be used quite effectively in the context of a branch-and-bound
framework, since they can recognize infeasible subtrees at “high” branches
of the search tree. In particular, we describe how to find an exact solution to the 
problem of minimizing the height of a container of given base area. If this third 
dimension represents time, this amounts to minimizing the makespan of a higher- 
dimensional scheduling problem. We validate the usefulness of these concepts and 
results by providing computational results. Other problem versions (like higher- 
dimensional knapsack or bin packing problems with order constraints) can be 
treated similarly. 

The rest of this paper is organized as follows. In Section 2 we describe
basic assumptions and some terminology. In Section 3 we introduce precedence
constraints, describe the mathematical foundations for incorporating them into
the search, and explain how to implement the resulting algorithms. Finally, we
present computational results for a number of different benchmarks in Section 4.

2 Preliminaries 

Problem instances. We assume that a problem instance is given by a set of
jobs $V$. All our results can be applied to instances in arbitrary fixed
dimension. For the purposes of this abstract, the reader may wish to focus on
the scenario where each job has a spatial requirement in the $x$- and
$y$-direction, denoted by $w_x(v)$ and $w_y(v)$, and possibly a duration,
denoted by a size $w_t(v)$ along the time axis. The available space $H$ consists
of an area of size $h_x \times h_y$. In addition, there may be an overall
allowable time $h_t$ for all jobs to be completed.

Graphs. Some of our descriptions make use of a number of different graph
classes. An (undirected) graph $G = (V, E)$ is given by a set of vertices $V$,
and a set of edges $E$; each edge describes the adjacency of a pair of vertices,
and we write $\{u, w\}$ for an edge between vertices $u$ and $w$. For a graph
$G$, we obtain the complement graph $\overline{G}$ by exchanging the set $E$ of
edges with the set $\overline{E}$ of non-edges. In a directed graph
$D = (V, A)$, edges are oriented, and we write $(u, w)$ to denote an edge
directed from $u$ to $w$. A graph $G = (V, E)$ is a comparability graph if the
edges $E$ can be oriented to a set of directed arcs $A$, such that we get a
(transitively closed) partial order, i.e., a cycle-free digraph for which the
existence of edges $(u, v) \in A$ and $(v, w) \in A$ for any $u, v, w \in V$
implies the existence of $(u, w) \in A$.

Precedence constraints. Mathematically, a set of precedence constraints
is given by a partial order $P = (V, \prec)$ on $V$. The relations in $\prec$
form a directed acyclic graph $D_P = (V, A_P)$, where $A_P$ is the set of
directed arcs. In the presence of such a partial order, a feasible schedule is
assumed to satisfy the capacity constraints of the container, as well as these
additional constraints.

Packing problems. In the following, we treat jobs as axis-aligned multi- 
dimensional boxes with given orientation, and feasible schedules as arrangements 
of boxes that satisfy all side constraints. This is implied by the term of a feasible 
packing. There may be different types of objective functions, corresponding to 
different types of packing problems. The Orthogonal Packing Problem (OPP) is 
to decide whether a given set of boxes can be placed within a given “container” 
of size $h_x \times h_y \times h_t$. For the Constrained OPP (COPP), we also
have to satisfy a partial order $P = (V, \prec)$ of precedence constraints in
the $t$-dimension. (To emphasize the motivation of temporal precedence
constraints, we write $t$ to suggest
that the time coordinate is constrained, and x and y to imply that the space 
coordinates are unrestricted. Clearly, our approach works the same way when 
dealing with spatial restrictions.) 

There are various optimization problems that have the OPP or COPP as their 
underlying decision problem. Since our main motivation arises from dynamic chip 
reconfigurations, where we want to minimize the overall running time, we focus 
on the Constrained Strip Packing Problem (CSPP), which is to minimize the 
size $h_t$ for a given base size $h_x \times h_y$, such that all boxes fit into
the container $h_x \times h_y \times h_t$. Clearly, we can use a similar
approach for other objective functions.



3 Packing Problems with Precedence Constraints 

As mentioned in the introduction, a key advantage of considering packing classes
is that it allows us to deal with packing problems independent of precise
geometric placement, and that it allows arbitrary feasible interchanges of
placement. However, for most practical instances, we have to satisfy additional
constraints for the temporal placement, i.e., for the start times of jobs. For
our approach, the nature of the data structures may simplify these problems from
three-dimensional to purely two-dimensional ones: If the whole schedule is
given, all edges $E_t$ in one of the graphs are determined, so we only need to
construct the edge sets $E_x$ and $E_y$ of the other graphs. As worked out in
detail in [?], this allows us to solve the resulting problems quite efficiently
if the arrangement in time is already given.

Fig. 3. (a) A two-dimensional packing class, with $G_x$ representing $x$-,
$G_y$ representing $y$-projections. (b) The corresponding comparability graphs.
(c) A transitive orientation. (d) A feasible packing corresponding to the
orientation.

Fig. 4. An example of a comparability graph
$\overline{G_t} = (V, \overline{E_t})$ with a partial order $P$ contained in
$\overline{E_t}$, such that no transitive orientation of $\overline{G_t}$
extends $P$.

A more realistic, but also more involved situation arises if only a set of 
precedence constraints is given, but not the full schedule. We describe in the 
following how further mathematical tools in addition to packing classes allow 
useful algorithms. 



3.1 Packing Classes and Interval Orders 

An important concept used frequently in the following is the concept of a
component graph. Any edge $\{v_1, v_2\}$ in a component graph $G_i$ corresponds
to an overlap between the projections of boxes 1 and 2 onto the $x_i$-axis. This
means that the complement graph $\overline{G_i}$ given by the complement
$\overline{E_i}$ of the edge set $E_i$ consists of all pairs of coordinate
intervals that are “comparable”: Either the first interval is “to the left” of
the second, or vice versa. Any (undirected) graph of this type is a so-called
comparability graph (see [?] for further details). By orienting edges to point
from “left” to “right” intervals, we get a partial order of the set $V$ of
vertices, a so-called interval order. Obviously, this order relation is
transitive, i.e., $e \prec f$ and $f \prec g$ imply $e \prec g$, which is the
reason why we also speak of a transitive orientation of the undirected
comparability graph $\overline{G_i}$. See Figure 3 for a (two-dimensional)
example of a packing class, the corresponding comparability graph, a transitive
orientation, and the packing corresponding to the transitive orientation.
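
The relation between projections, component edges, and the interval order can be reproduced in a few lines. The sketch assumes half-open coordinate intervals, so that boxes that merely touch do not overlap; this convention and the function names are assumptions made for illustration.

```python
def component_and_order(intervals):
    """Split all pairs into overlap edges and left-to-right arcs.

    intervals: name -> (start, end), treated as half-open [start, end)
    Returns (component edges E_i, arcs orienting the complement of E_i).
    """
    names = sorted(intervals)
    overlap, arcs = set(), set()
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            (sa, ea), (sb, eb) = intervals[a], intervals[b]
            if ea <= sb:
                arcs.add((a, b))              # a lies completely to the left of b
            elif eb <= sa:
                arcs.add((b, a))              # b lies completely to the left of a
            else:
                overlap.add(frozenset((a, b)))  # the projections overlap
    return overlap, arcs


def is_transitive(arcs):
    """True if (u, v) and (v, w) in arcs always implies (u, w) in arcs."""
    return all((u, w) in arcs
               for (u, v) in arcs
               for (v2, w) in arcs
               if v == v2)
```

Because "entirely to the left of" is transitive for intervals, the arcs returned here always pass the `is_transitive` check, which is the observation made in the paragraph above.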

Now consider a situation where we need to satisfy a partial order
$P = (V, A_P)$ of precedence constraints in the time dimension. It follows that
each arc $a = (u, w) \in A_P$ in this partial order forces the corresponding
undirected edge $e = \{u, w\}$ to be excluded from $E_t$. Thus, we can simply
initialize our algorithm for constructing packing classes by fixing all
undirected edges corresponding to $A_P$ to be contained in $\overline{E_t}$.
After running the original algorithm, we may get additional comparability edges.
As the example in Figure 4 shows, this causes an additional problem: Even if we
know that the graph $\overline{G_t}$ has a transitive orientation, and all arcs
$a = (u, w)$ of the precedence order $(V, A_P)$ are contained in
$\overline{E_t}$ as $e = \{u, w\}$, it is not clear that there is a transitive
orientation that contains all arcs of $A_P$.

3.2 Finding Feasible Transitive Orientations 

Consider a comparability graph $\overline{G}$ that is the complement of an
interval graph $G$. The problem TOP of deciding whether $\overline{G}$ has a
transitive orientation that extends a given partial order $P$ has been studied
in the context of scheduling, where $\overline{G}$ is the comparability graph of
an interval order. For this scenario, Korte and Möhring [?] give a linear-time
algorithm for determining a solution for TOP, or deciding that none exists.
Their approach is based on a very special data structure called modified
PQ-trees.

In principle, it is possible to solve higher-dimensional packing problems with 
precedence constraints by adding this algorithm as a black box to test the leaves 
of our search tree for packing classes: In case of failure, backtrack in the tree. 
However, the resulting method cannot be expected to be reasonably efficient: 
During the course of our tree search, we are not dealing with one fixed com- 
parability graph, but only build it while exploring the search tree. This means 
that we have to expect spending a considerable amount of time testing similar 
leaves in the search tree, i.e., comparability graphs that share most of their graph 
structure. It may be that already a very small part of this structure that is fixed 
very “high” in the search tree constitutes an obstruction that prevents a feasible 
orientation of all graphs constructed below it. So a “deep” search may take a 
long time to get rid of this obstruction. This makes it desirable to use more 
structural properties of comparability graphs and their orientations to make use 
of obstructions already “high” in the search tree. 

3.3 Implied Orientations 

As in the basic packing class approach, we consider the component graphs $G_i$ and
their complements, the comparability graphs $\bar{G}_i$. This means that we continue
to have three basic states for any edge: (1) edges that have been fixed to be
in $E_i$, i.e., component edges; (2) edges that have been fixed to be in $\bar{E}_i$, i.e.,
comparability edges; (3) unassigned edges.

In order to deal with precedence constraints, we also consider orientations of
the comparability edges. This means that during the course of our tree search,
we can have three different possible states for each comparability edge: (2a)
one possible orientation; (2b) the opposite possible orientation; (2c) no assigned
orientation.

Fig. 5. Two types of implications for edges and their orientations: Above are the
arrangements that trigger path implications (D1, left) and transitivity implications
(D2, right); below are the forced orientations of edges.

Fig. 6. (a) A graph $\bar{G}_t$ with a partial order formed by three directed edges; (b) there
are three path implication classes that each have one directed arc; (c) carrying out
path implications creates directed cycles, i.e., transitivity conflicts.

A stepping stone for this approach arises from considering the following two
configurations (see Figure 5).

The first consists of the two comparability edges $\{v_1,v_2\}, \{v_2,v_3\} \in \bar{E}_t$, such
that the third edge $\{v_1,v_3\}$ has been fixed to be an edge from the component
graph $E_t$. Now any orientation of one of the comparability edges forces the
orientation of the other comparability edge, as shown in the left part of the figure.
We call this arrangement a path implication, since this configuration corresponds
to an induced path in $\bar{G}_t$.

The second configuration consists of two directed comparability edges, say,
the edges $(v_1,v_2)$ and $(v_2,v_3)$. In this case we know that the edge $\{v_1,v_3\}$ must
also be a comparability edge, with an orientation of $(v_1,v_3)$. Since this configuration
arises directly from transitivity in $\bar{G}_t$, we call this arrangement a transitivity
implication.

Clearly, any implication arising from one of the above configurations can 
induce further implications. 

In particular, when considering only sequences of path implications, we get a
partition of comparability edges into path implication classes. Two comparability
edges are in the same implication class iff there is a sequence of path implications
such that orienting one edge forces the orientation of the other edge. For
an example, consider the arrangement in Figure 0. Here, all three comparability
edges $\{v_1,v_2\}$, $\{v_2,v_3\}$, and $\{v_3,v_4\}$ are in the same path implication class. Now
the orientation of $(v_1,v_2)$ implies the orientation $(v_3,v_2)$, which in turn implies
the orientation $(v_3,v_4)$, contradicting the orientation of $\{v_3,v_4\}$ in the given partial
order P. It is not hard to see that the implication classes form a partition
of the comparability edges, since we are dealing with an equivalence relation.
We call a violation of a path implication a path conflict.

As the example in Figure 6 shows, only excluding path conflicts when recursively
carrying out path implications does not suffice to guarantee the existence
of a feasible orientation: Working through the queue of path implications, we
end up with a directed cycle, which violates a transitivity implication.

We call a violation of a transitivity implication a transitivity conflict. 
Summarizing, we have the following necessary conditions for the existence of 
a transitive orientation that extends a given partial order P: 

D1: Any path implication can be carried out without a conflict.

D2: Any transitivity implication can be carried out without a conflict. 

These necessary conditions are also sufficient: 



Theorem 2 (Fekete, Kohler, Teich) 

Let $P = (V, A_P)$ be a partial order with arc set $A_P$ that is contained in the
edge set E of a given comparability graph $G = (V, E)$. $A_P$ can be extended to a
transitive orientation of G iff all arising path implications and transitivity
implications can be carried out without creating a path conflict or a transitivity conflict.

A proof and further mathematical details are described in our report jS|. 
The interested reader may take note that we are extending previous work by 
Gallai m who extensively studied implication classes of comparability graphs. 
In particular, we use the concept of modular decomposition. For more background 
on implication classes and comparability graphs, see Kelly \n\, Mohring 122 ] 
for informative surveys on this topic, and Kramer for an application in 
scheduling theory. 



3.4 Solving Problems with Precedence Constraints 

We start by fixing for all arcs $(u,v) \in A_P$ the edge $\{u,v\}$ as an edge in the
comparability graph $\bar{G}_t$, and we also fix its orientation to be $(u,v)$. In addition
to the tests for enforcing the conditions for unoriented packing classes (C1, C2,
C3), we employ the implications suggested by conditions D1 and D2. For this
purpose, we check directed edges in $\bar{G}_t$ for being part of a triangle that gives rise
to either implication. Any newly oriented edge in $\bar{G}_t$ gets added to a queue of
unprocessed edges. As for packing classes, we can again get cascades of fixed
edge orientations. If we get an orientation conflict or a cycle conflict, we can
abandon the search on this tree node. The correctness of the overall algorithm
follows from Theorem 2; in particular, the theorem guarantees that we can carry
out implications in an arbitrary order.
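The queue-based propagation described above can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the function name `propagate_implications` is hypothetical, and the sketch assumes a fully decided edge state in which every vertex pair that is not a comparability edge counts as a component edge (there are no unassigned edges). It carries out only the implications triggered by the given arcs and reports a conflict as soon as D1 or D2 cannot be satisfied.

```python
from collections import deque


def propagate_implications(vertices, comparability, arcs):
    """Carry out path (D1) and transitivity (D2) implications.

    vertices      -- iterable of hashable vertex labels
    comparability -- iterable of 2-element edges {u, v}; every other pair is
                     treated as a component edge (an assumption of this sketch)
    arcs          -- iterable of (u, v) orientations given by the partial order
    Returns the set of oriented arcs after propagation, or None on a conflict.
    """
    vertices = list(vertices)
    comparability = {frozenset(e) for e in comparability}
    oriented = set()
    queue = deque()

    def orient(u, v):
        # Fails if {u, v} is not a comparability edge or is oriented the other way.
        if frozenset((u, v)) not in comparability or (v, u) in oriented:
            return False
        if (u, v) not in oriented:
            oriented.add((u, v))
            queue.append((u, v))
        return True

    if not all(orient(u, v) for u, v in arcs):
        return None
    while queue:
        u, v = queue.popleft()
        for w in vertices:
            if w in (u, v):
                continue
            uw = frozenset((u, w)) in comparability
            vw = frozenset((v, w)) in comparability
            forced = []
            if vw and not uw:          # induced path u - v - w: both arcs point into v
                forced.append((w, v))
            if uw and not vw:          # induced path v - u - w: both arcs leave u
                forced.append((u, w))
            if (v, w) in oriented:     # transitivity: u -> v -> w forces u -> w
                forced.append((u, w))
            if (w, u) in oriented:     # transitivity: w -> u -> v forces w -> v
                forced.append((w, v))
            if not all(orient(a, b) for a, b in forced):
                return None
    return oriented
```

On the path with comparability edges {1,2}, {2,3}, {3,4}, the arcs (1,2) and (4,3) lead to a path conflict, mirroring the example in the text, whereas the single arc (1,2) propagates to (3,2) and (3,4).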
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Fig. 7. Block diagram of a video-codec (H.261) (left); precedence constraints for the 
video-codec (right). 



4 Computational Experiments 

In the following we present our results for different types of instances: The video-codec
benchmark described in Section 4.1 arises from an actual application to
FPGAs. In Section 4.2 we give a number of results arising from different geometric
packing problems.

Our code was implemented in C++ and tests were carried out on a SUN
Ultra 2.



4.1 Video-Codec Benchmark

Figure 7 shows a block diagram of the operation of a hybrid image sequence
coder/decoder that arises from the FPGA application. The purpose of the coder
is to compress video images using the H.261 standard. In this device, transformative
and predictive coding techniques are unified. The compression factor can be
increased by a predictive method for motion estimates: blocks inside a frame are
predicted from blocks of previous images. See our report ^ for more technical
details. The result is shown in Table 1.



4.2 Geometric Instances 

Here we describe computational results for two types of two-dimensional objects. 
See Table 2 for an overview. The first class of instances was constructed from a



Table 1. Optimizing reconfigurations for the Video-Codec

test    container sizes            CPU-time (s)
        h_t     h_x     h_y
1       59      64      64         24.87 s

Table 2. Optimal packing with order constraints

instance         optimal h_t    h_x          upper bound    lower bound
okp17-0          169            100          7.29 s         179 s
okp17-1          172            100          6.73 s         1102 s
okp17-2          182            100          5.39 s         330 s
okp17-3          184            100          236 s          553 s
okp17-4          245            100          0.17 s         0.01 s
square21-no      112            112          84.28 s        0.01 s
square21-mat     117            112          15.12 s        277 s
square21-tri     125            112          107 s          571 s
square21-2mat    [118,120]      [118,120]    346 s          476 s
Table 3. The okp17 problem instances.

okp17: base width of container = 100,
number of boxes = 17
sizes = [(8, 81), (5, 76), (42, 19), (6, 80),
(9, 52), (41, 48), (6, 86), (58, 20),
(99, 3), (100, 14), (7, 53), (24, 54),
(23, 77), (42, 32), (17, 30), (11, 90),
(26, 65)]

okp17-0: no order constraints
okp17-1: 11 → 8, 11 → 16
okp17-2: 11 → 8, 11 → 16, 8 → 16
okp17-3: 11 → 8, 11 → 16, 8 → 16, 8 → 17,
11 → 7, 16 → 7
okp17-4: 11 → 8, 11 → 16, 8 → 16, 8 → 17,
11 → 7, 16 → 7, 17 → 16


Table 4. The square21 problem instances.

square21: base width of container = 112,
number of boxes = 21
sizes = [(50, 50), (42, 42), (37, 37), (35, 35),
(33, 33), (29, 29), (27, 27), (25, 25),
(24, 24), (19, 19), (18, 18), (17, 17),
(16, 16), (15, 15), (11, 11), (9, 9),
(8, 8), (7, 7), (6, 6), (4, 4), (2, 2)]

square21-0: no order constraints
square21-mat: 2 → 4, 6 → 7, 8 → 9, 11 → 15, 16 → 17,
18 → 19, 24 → 25, 27 → 29, 33 → 35,
37 → 42, 2 → 50, 50 → 4
square21-tri: 2 → 15, 15 → 17, 2 → 27, 4 → 16,
16 → 29, 4 → 29, 6 → 17, 17 → 33,
6 → 33, 7 → 18, 18 → 35, 7 → 35,
8 → 19, 19 → 37, 8 → 37, 9 → 24,
24 → 42, 9 → 42, 11 → 25, 25 → 50,
11 → 50

square21-2mat: ai-constraints:
2 → 19, 6 → 25, 8 → 29, 11 → 35,
16 → 42, 18 → 4, 24 → 7, 27 → 9,
33 → 15, 37 → 17, 50 → 4, 18 → 50
^-constraints:
2 → 4, 6 → 7, 8 → 9, 11 → 15, 16 → 17,
18 → 19, 24 → 25, 27 → 29, 33 → 35,
37 → 42, 2 → 50, 50 → 4




Fig. 8. An optimal solution
for the instance okp17-1.



Fig. 9. An optimal solution for 
the instance square21-mat. 
particularly difficult random instance of 2-dimensional knapsack (see j^l). Results
are given for order constraints of increasing size. In order to give a better idea 
of the computational difficulty, we give separate running times for finding an 
optimal feasible solution, and for proving that this solution is best possible. 

See Table 3 for the exact sizes of the 17 rectangles involved for the geometric
layout of optimal packings. For easier reference, the boxes are labeled 1-17 in 
the order in which they are listed in the table. 

The second class of instances arises from the well-known tiling of a 112 × 112
square by 21 squares of different sizes. Again, we have added order constraints
of various sizes. (See Table 4 for details.) For the instance square21-2mat (with
order constraints in two dimensions), we could not close the gap between upper 
and lower bound. For this instance, we report the running times for achieving 
the best known bounds. 

Two examples of resulting packings are shown in Figures 8 and 9.
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Abstract. We investigate a variant of the bin packing problem in which 
items may be fragmented into smaller size pieces called fragments. While 
there are a few applications to bin packing with item fragmentation, our 
model of the problem is derived from a scheduling problem present in 
data over CATV networks. Fragmenting an item is associated with a 
cost which renders the problem NP-hard. We study two possible cost 
functions and as a result get two variants of bin packing with item frag- 
mentation. In the first variant, called bin packing with size-increasing 
fragmentation, each item may be fragmented in which case overhead 
units are added to the size of every fragment. In the second variant each 
item has a size and a cost and fragmenting an item increases its cost 
but does not change its size. We call this variant bin packing with size- 
preserving fragmentation. 

We develop several algorithms for the problem and investigate their per- 
formance. The algorithms we present are based on well known bin pack- 
ing algorithms such as Next-Fit and First-Fit Decreasing, as well as of 
other algorithms. . . . 



1 Introduction 

Because of its applicability to a large number of applications and because of 
its theoretical interest bin packing has been widely researched and investigated 
(see, e.g., 13, m, El and 13 for a comprehensive survey). In the classical one-dimensional
bin packing problem, we are given a list of items $L = (a_1, a_2, \ldots, a_n)$,
each with a size $s(a_i) \in (0,1]$, and are asked to pack them into a minimum
number of unit capacity bins. Since the problem, like many of its derivatives, is
NP-hard, many approximation algorithms have been developed for it (see, e.g.,
P]>P>|S| and for a survey). The common assumption in bin packing problems
is that an item may not be fragmented into smaller pieces. There are several
applications, however, in which this assumption does not hold. The subject of
item fragmentation in bin packing problems has received almost no attention so far.
This paper concentrates on aspects that were heretofore never researched, such 
as developing algorithms for the problem and investigating their performance. 

The variant of bin packing presented in this paper is derived from a scheduling
problem present in data over CATV (community antenna television) networks.
In particular we refer to the Data-Over-Cable Service Interface Specification
(DOCSIS), a standard of the Multimedia Cable Network System (MCNS) standard
committee; for a detailed description see m- When using CATV networks
for data communication the data subscribers are connected via a cable modem to 
the headend. The headend is responsible for the scheduling of all transmissions 
in the upstream direction (from the cable modem to the headend). Scheduling 
is done by dividing the upstream, in time, into a stream of numbered mini- 
slots. The headend receives requests from the modems for allocation of datagram 
transmission. The length of each datagram can vary and may require a different 
number of mini-slots. From time to time, the headend publishes a MAP in which 
it allocates mini-slots to one modem or a group of modems. The scheduling problem
is that of allocating the mini-slots to be published in the MAP, or in other
words, how to order the datagram transmissions in the best possible way.

The headend must consider two kinds of datagram allocations: 

1. Fixed Location - Allocations for connections with timing demands, such as 
a CBR (constant bit rate) connection. These connections must be scheduled 
so as to ensure delivering the guaranteed service. Fixed location datagrams 
are therefore scheduled in fixed, periodically located mini-slots. 

2. Free Location - Allocations for connections without timing demands, such 
as a best effort connection. Free location datagrams can use any of the mini- 
slots. 

The headend therefore performs the allocation in two stages: in the first stage 
it schedules, or allocates, all fixed location datagrams. We assume that after the 
fixed allocations have been made, a gap of Lf mini-slots is left between successive 
fixed allocations. In the second stage all free location datagrams are scheduled. 
The free allocations must fit into the gaps left by the fixed allocations. 

The relation to the bin packing problem should now be clear. The items are 
the free location datagrams that should be scheduled, each of which may require 
a different number of mini-slots. The bins are defined by the gaps between every 
two successive fixed allocations in the MAP. The goal is to use the available 
mini-slots in the MAP in the best way. We point out that the practical schedul- 
ing problem may actually be somewhat more complicated since the bins (gaps 
between fixed allocation) may not be of uniform size. Dealing with variable size 
bins is beyond the scope of this paper (we present some results in PH). 

One of the capabilities of the system is the ability to break a datagram into 
smaller pieces called fragments. When a datagram is fragmented, i.e., transmitted 
in non successive mini-slots, extra bits are added to the original datagram to 
enable the reassembly of all the fragments at the headend. In a typical CATV 
network one mini-slot is added to every fragment. 

To make the problem of bin packing with item fragmentation nontrivial a cost 
must be associated with fragmentation. We study two possible cost functions 
and as a result get two variants of bin packing with item fragmentation. In 
the first variant, called bin packing with size-increasing fragmentation, the cost 
function adds one (or more) overhead unit to the size of every fragment. In 
the second variant the cost function increases the cost of an item upon each 
fragmentation, but does not change its size. We call this variant bin packing
with size-preserving fragmentation. The scheduling problem, where the cost is
due to the extra overhead bits that are added to each fragment, serves as a model
for the first variant. The second variant is suitable for problems where the cost
associated with fragmentation is a result of extra processing time or reassembly
delay. The two cost functions we present here are not the only possible cost
functions. In other applications, for example, the cost may be related to the
size of the item; a combination of the two cost functions is also possible. It is
interesting to note that when the cost associated with fragmentation is ignored
the packing problem becomes trivial, and when the cost is very high it does not
pay to fragment items and we face the classical problem. Hence, the problem is
interesting for middle-range costs. It has been shown in m that for nonzero
cost the ability to fragment items does not reduce the complexity of the
problem, that is, the problem of bin packing with item fragmentation is NP-hard.

We present worst case analysis of both variants of bin packing with item frag- 
mentation. We begin by showing that the two variants are NP-hard in the strong 
sense. We then devise approximation algorithms for the problem and investigate 
their performance. We restrict our attention to practical bin packing algorithms, 
i.e., of low order polynomial running time, and examine both online and offline 
algorithms. Online algorithms are applicable to cases where the items arrive in 
some order and must be assigned to the bins as soon as they arrive. Offline al- 
gorithms assume the entire list of items is known before the packing begins. We 
devise algorithms which are based on well known bin packing algorithms but 
include the capability of fragmenting items. We investigate the performance of 
algorithms such as Next-Fit (NF) and First-Fit Decreasing (FFD), as well as 
of other algorithms. 

The remainder of the paper is organized as follows. In Section 2 we address the
problem of bin packing with size-increasing fragmentation. Section 3 is devoted
to the problem of bin packing with size-preserving fragmentation.

2 Bin Packing with Size-Increasing Fragmentation 

In this section we study the variant of bin packing with item fragmentation where 
fragmentation increases the size of an item. We define the problem similar to 
the classical bin packing problem. The classical bin packing problem deals with 
equal-sized (unit capacity) bins and a list of items each of which can fit in every 
bin. To handle fragmentation we use a discrete version of the problem and add 
a fragmentation cost function that adds overhead units to each fragment. We 
proceed to formally define the problem. 

Bin Packing with Size-Increasing Fragmentation (BP-SIF): We are given
a list of items $L = (a_1, a_2, \ldots, a_n)$, each with a size $s(a_i) \in \{1, 2, \ldots, U\}$. The items
must be packed into a minimum number of bins, which are all of size U units.
When packing a fragment of an item, one unit of overhead is added to the size
of every fragment.

Performance Ratio: We use the same definition as is typically used in analyzing
the classical problem. For a given list L and algorithm A, let $A(L)$ be the
number of bins used when algorithm A is applied to list L, let $OPT(L)$ denote
the optimum number of bins for a packing of L, and let $R_A(L) = A(L)/OPT(L)$.
The asymptotic worst case performance ratio is defined to be:

$R_A^\infty = \inf\{r \ge 1 : \text{for some } N > 0,\ R_A(L) \le r \text{ for all } L \text{ with } OPT(L) \ge N\}$ (1)

The bin packing problem is known to be NP-hard in the strong sense ^21. We
show that the complexity of BP-SIF is the same.

Claim. BP-SIF is NP-hard in the strong sense. 

Proof. We denote by D(BP-SIF) the decision version of BP-SIF and show that
it is NP-complete in the strong sense. We do so by reducing the 3-PARTITION
problem to a restricted instance of D(BP-SIF). The 3-PARTITION problem
(defined formally below) is known to be NP-complete in the strong sense |2|.

3-PARTITION: given a list L of $n = 3m$ integers $w_1, w_2, \ldots, w_n$ and a bound
$B \in \mathbb{Z}^+$ such that $B/4 < w_j < B/2$ for $j = 1, \ldots, n$ and $\sum_{j=1}^{n} w_j = mB$, can
L be partitioned into m disjoint subsets $S_1, \ldots, S_m$ such that $\sum_{w_j \in S_i} w_j = B$ for
$i = 1, \ldots, m$?

We define D(BP-SIF) as follows: given a list of items L, a size $s(a) \in \mathbb{Z}^+$ for
each item $a \in L$, a positive integer bin capacity U and a positive integer K, is
there a feasible packing of L in K bins of size U? Any instance I of 3-PARTITION
can be polynomially transformed into an equivalent instance I' of D(BP-SIF) by
setting $U = B$ and $K = m$. To see that the two decision problems are equivalent,
note that, since the total size of all items equals the total capacity
of all bins, in any "yes" instance of D(BP-SIF) all n items are packed without
fragmentation and the packing is therefore also valid for 3-PARTITION. Clearly
the packing of any "yes" instance of 3-PARTITION is also valid for D(BP-SIF).
It follows that the "yes" and "no" instances of the two problems are equivalent.

□ 
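The transformation used in the proof is simple enough to write down directly. The following Python sketch is only an illustration: the function name `three_partition_to_bpsif` is hypothetical, and the input checks merely restate the conditions in the definition of 3-PARTITION given above.

```python
def three_partition_to_bpsif(weights, B):
    """Map a 3-PARTITION instance (weights, B) to a D(BP-SIF) instance.

    Returns (sizes, U, K): item sizes, bin capacity U = B and bin count K = m.
    Raises ValueError if the input is not a well-formed 3-PARTITION instance.
    """
    n = len(weights)
    if n == 0 or n % 3 != 0:
        raise ValueError("the number of integers must be n = 3m with m >= 1")
    m = n // 3
    # B/4 < w < B/2, written with integer arithmetic only
    if any(not (B < 4 * w and 2 * w < B) for w in weights):
        raise ValueError("every weight must satisfy B/4 < w < B/2")
    if sum(weights) != m * B:
        raise ValueError("the weights must sum to m * B")
    return list(weights), B, m
```

Because the total item size equals K · U, any packing into K bins has no room for overhead units, which is exactly why a feasible packing cannot fragment any item.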

When packing a list of n items into m bins, the maximum number of fragmentations
possible is $n \cdot m$ (each item is fragmented over all bins). From the
definition of the problem it is obvious that a good algorithm should try to perform
the minimum number of fragmentations. Therefore we would only like to
consider algorithms that do not fragment items unnecessarily.

Definition 1 : An algorithm A is said to prevent unnecessary fragmentation if 
it follows the following two rules: 

1. No unnecessary fragmentation: An item (or fragment of an item) is frag- 
mented only if it is to be packed into a bin that cannot contain it. In case of 
fragmentation, the item (or fragment) is divided into two fragments. The first 
fragment must fill one of the bins. The second fragment is packed according 
to the packing rules of the algorithm. 

2. No unnecessary bins: An item is packed in a new bin only if it cannot fit in 
any of the open bins used by A. 

Algorithms that prevent unnecessary fragmentation have the following prop- 
erty. 
Lemma 1. For any algorithm A that prevents unnecessary fragmentation,
$R_A^\infty \le \frac{U}{U-2}$, for every $U > 2$.



Proof. Assume that the number of bins used by the algorithm is $A(L) = m$.
Since A prevents unnecessary fragmentation it can perform at most $m - 1$ fragmentations
while packing m bins (no fragmentation in the last bin), regardless of
the size of the list. Each fragmentation adds two units of overhead at the most.
Therefore, in the worst case, $2(m - 1)$ units of overhead are added to the total
size of all items. Note that in this case only the last bin may be left unfilled.
Assuming the optimal packing does not fragment any item, the number of bins
used by it satisfies: $OPT(L) \ge \frac{(m-1)(U-2)+1}{U}$.

The asymptotic performance ratio follows:

$R_A(L) = \frac{A(L)}{OPT(L)} \le \frac{mU}{(m-1)(U-2)+1}$

$R_A^\infty \le \frac{U}{U-2}$

□
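The limit step at the end of the proof can be spelled out as a short worked derivation. It uses only the quantities of the proof (m bins used by A, bin size U > 2) and is meant as a reading aid rather than as part of the original argument.

```latex
\[
  R_A(L) \;\le\; \frac{mU}{(m-1)(U-2)+1}
         \;=\; \frac{mU}{m(U-2)-(U-3)}
         \;=\; \frac{U}{(U-2)-\dfrac{U-3}{m}}
         \;\xrightarrow[\;m\to\infty\;]{}\; \frac{U}{U-2}.
\]
```

Since A(L) = m grows without bound as OPT(L) grows, the correction term (U-3)/m vanishes in the asymptotic ratio.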

Remark: For the more general case, where r units of overhead (instead of one)
are added to the size of every fragment, it can be shown by similar arguments
that: $R_A^\infty \le \frac{U}{U-2r}$.

We now have an upper bound on the performance ratio of any algorithm. In 
the remainder of this section we investigate specific algorithms to find their actual 
performance ratio. For a given algorithm A we define a version of the algorithm 
that allows item fragmentation and denote it by Af. We investigate the worst 
case performance ratio of the following algorithms: NFf, NFDf (NFIf) and 
FFDf (BFDf). 

2.1 Next-Fit with Item Fragmentation - NFf 

The NFf algorithm is defined similar to the NF algorithm. 

Algorithm NFf - In each stage there is only one open bin. The items are 
packed, according to their order in the list L, into the open bin. When an item 
does not fit in the open bin, it is fragmented into two parts. The first part fills 
the open bin and the bin is closed. The second part is packed into a new bin 
which becomes the open bin. Offline version of the algorithm sorts the items in 
decreasing (NFDf) or increasing (NFIf) order before packing the list. 
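The online rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function name `next_fit_fragmenting` is hypothetical, one unit of overhead is charged per fragment as in the BP-SIF definition, and when the open bin has fewer than two free units (so that a fragment could carry no payload) the sketch simply closes the bin without fragmenting.

```python
def next_fit_fragmenting(sizes, U):
    """Pack items (sizes in 1..U) by Next-Fit with fragmentation.

    Returns (bins, fragmentations): the number of bins opened and the
    number of items that were split into two fragments.
    """
    bins = 0           # bins opened so far
    free = 0           # free space in the currently open bin
    fragmentations = 0
    for s in sizes:
        if not 1 <= s <= U:
            raise ValueError("item sizes must lie in 1..U")
        if s <= free:              # item fits into the open bin
            free -= s
            continue
        if free >= 2:
            # First fragment fills the open bin: payload free - 1 plus one
            # overhead unit. The rest becomes a fragment with its own overhead.
            s = s - (free - 1) + 1
            fragmentations += 1
        bins += 1                  # close the open bin, open a new one
        free = U - s
    return bins, fragmentations
```

For U = 6 and the list [3, 1] repeated six times, the sketch opens six bins with five fragmentations, while four bins suffice without fragmentation (three bins of two 3s and one bin of six 1s).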

The NFf algorithm is very simple, can be implemented to run in linear time 
and requires only one open bin (bounded space). However, as we show next, 
similar to the classical problem, the performance ratio the algorithm achieves is 
the worst possible. 

Theorem 1. For algorithm NFf: $R^\infty_{NF_f} = \frac{U}{U-2}$, $\forall U \ge 6$.

Proof. Lemma 1 provides an upper bound on the performance ratio of the
algorithm. We present an example that proves the lower bound. Let us first
consider the case where the bin size U is an even number. As a worst case
example we choose the following list L: The first item is of size $U/2$, the
next $(\frac{U}{2} - 2)$ items are of size 1. The rest of the list repeats this pattern $kU$
times. The optimal packing avoids fragmentations by packing bins with two
items of size $U/2$ or U items of size 1. The total number of bins used is:
$OPT(L) = \frac{U}{2}k + (\frac{U}{2} - 2)k = (U - 2)k$. On the other hand, algorithm NFf
fragments each item of size $U/2$ (except for the first item). Overall $2(kU - 1)$
units of overhead are added to the packing and therefore the number of bins
used by NFf is:

$NF_f(L) = \left\lceil \frac{U}{2}k + \frac{2(kU-1)}{U} + \left(\frac{U}{2} - 2\right)k \right\rceil = \left\lceil Uk - \frac{2}{U} \right\rceil = Uk.$ (2)

A worst case example for the case where U is an odd number is similar. The
first item in L is of size $(U - 1)/2$, the next $(\frac{U-1}{2} - 1)$ items are of size 1. The
rest of the list repeats this pattern $kU$ times. It is easy to verify that, as in the
previous example, $OPT(L) = (U - 2)k$, while $NF_f(L) = kU$. □



When the bin size is very small the above proof does not hold. For the values
$3 \le U \le 5$ we show in HH that the worst case asymptotic performance ratio is:

$R^\infty_{NF_f} = \frac{3}{2}$.

It is interesting to compare the NFf algorithm to the classic NF algorithm 
for which the asymptotic worst case performance ratio is: V C/ > 

1. While the performance ratio of NF is increasing with U, the performance ratio
of NFf is decreasing with U. This is intuitive since as the bin size gets larger
the effective cost of fragmentation gets smaller.



NFDf and NFIf Algorithms. Given the poor performance of the NFf 
algorithm, one may ask whether sorting the items before applying the NFf 
packing, would yield better results. It turns out that algorithms NFDf and 
NFIf are very similar to the NFf algorithm in their worst case performance. 
The actual performance ratio of these algorithms depends on the bin size U, but 
in all cases it is not far from the ratio of NFf. To avoid dealing with each value 
of U separately, we first present an example for a general value. Our list of items 
is made of k items of size $U - 2$ and k items of size 2. The optimal packing uses k
bins where each bin contains two items, one of each kind. The NFDf algorithm
first packs k items, of size $U - 2$, into k bins. Then $k - 1$ items of size 2 are packed
into $\lceil 2(k-1)/U \rceil$ bins if U is even, or $\lceil 2(k-1)/(U-1) \rceil$ bins if it is odd.
The NFIf algorithm will use the same number of bins, since the only difference
is that the items are packed in reverse order. This simple example gives us the
following lower bounds:

$R^\infty_{NFD_f} = R^\infty_{NFI_f} \ge \frac{U+2}{U}$, $U \ge 6$ (U even) (3)

$R^\infty_{NFD_f} = R^\infty_{NFI_f} \ge \frac{U+1}{U-1}$, $U \ge 5$ (U odd) (4)
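As a reading aid, the two bounds can be obtained from the bin counts of the example by letting k grow. The derivation below is a sketch that uses only the quantities stated in the text (k bins in the optimal packing, and the two ceiling expressions for the bins holding the items of size 2).

```latex
\[
  \text{$U$ even:}\quad
  \frac{k + \lceil 2(k-1)/U \rceil}{k}
  \;\xrightarrow[\;k\to\infty\;]{}\; 1 + \frac{2}{U} = \frac{U+2}{U},
  \qquad
  \text{$U$ odd:}\quad
  \frac{k + \lceil 2(k-1)/(U-1) \rceil}{k}
  \;\xrightarrow[\;k\to\infty\;]{}\; 1 + \frac{2}{U-1} = \frac{U+1}{U-1}.
\]
```

For instance, U = 6 gives 8/6 = 4/3, and U = 5 gives 6/4 = 3/2.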
The above example provides a lower bound. We now demonstrate that in
some cases the NFDf and NFIf algorithms can perform just as badly as NFf.
The first example is for $U = 5$ and a list $L = \{3, 3, 2, 2\}$, for which
$R_{NFD_f} = R_{NFI_f} = R_{NF_f} = \frac{3}{2}$. To see that such examples are not restricted to low values
of bin size, consider $U = 32$ and choose a list of $15k$ items of size 10 and $15k$
items of size 6. In the optimal packing the content of each bin is {10, 10, 6,
6}, therefore $OPT(L) = 7.5k$. The total number of bins used by the algorithms
is: $NFD_f(L) = NFI_f(L) = 8k$, since all bins (save two) contain two units of
overhead. The performance ratio of all three algorithms in this case is $\frac{16}{15}$, which
is the worst possible ratio. On the other hand for some values of bin size, NFIf
and NFDf have a better performance ratio than NFf. For example when $U = 6$,
$R^\infty_{NFD_f} = R^\infty_{NFI_f} = \frac{4}{3}$, while $R^\infty_{NF_f} = \frac{3}{2}$.

2.2 First-Fit Decreasing with Item Fragmentation - FFDf 

We now develop an algorithm based on the FFD heuristic. Let us first describe 
algorithm FFDf which packs items from a list L, into a fixed number of m bins. 
Algorithm FFDf - First-Fit Decreasing with item fragmentation: The algo- 
rithm packs the items in decreasing order. An item is packed into the lowest 
indexed bin into which it fits. If an item does not fit into any bin it is frag- 
mented. When fragmenting an item the first fragment fills the lowest indexed 
bin that is as yet not full. If the second fragment can be packed without frag- 
mentation, it is packed into the lowest indexed bin into which it fits, otherwise 
another fragmentation is performed according to the above rule. 

Remark: Other definitions are possible. For example, upon fragmentation we 
may choose to insert the second fragment back to the list. Another possibility 
is to first go over the whole list and pack items without fragmentation and only 
then pack the remaining items into the available free space. 

Note that FFDf may not be able to pack all the items in L. To ensure all 
items are packed we use the following iterative algorithm: 

Algorithm FFDf Iterative (FFDf-I) - The FFDf-I algorithm tries to
pack the list L into a fixed number of m bins. If it fails it increases m by one
and tries again. Let $s(L)$ be the sum of all items in L; the first value of m is
$m_1 = \lceil s(L)/U \rceil$, which is the minimum number of bins possible.

The algorithm performs the following steps:

1. Set $m = m_1 = \lceil s(L)/U \rceil$.

2. Try to pack the list L into m bins using the FFDf algorithm.

3. If all items were packed, stop.

4. Otherwise set $m = m + 1$ and go to step 2.
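The steps above, together with the FFDf rule, can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the names `ffd_f` and `ffd_f_iterative` are hypothetical, one overhead unit is charged per fragment as in BP-SIF, and a bin with fewer than two free units is treated as unusable for a fragment (an assumption, since such a fragment would carry no payload).

```python
def ffd_f(sizes, U, m):
    """Try to pack the items into m bins of size U by FFD with fragmentation.

    Returns the list of free space per bin on success, or None on failure.
    """
    free = [U] * m
    for s in sorted(sizes, reverse=True):
        payload, overhead = s, 0   # overhead becomes 1 once the item is split
        while True:
            size = payload + overhead
            fit = next((i for i in range(m) if free[i] >= size), None)
            if fit is not None:    # lowest indexed bin into which the piece fits
                free[fit] -= size
                break
            # Fragment: fill the lowest indexed bin that can hold some payload.
            part = next((i for i in range(m) if free[i] >= 2), None)
            if part is None:
                return None        # the list cannot be packed into m bins
            payload -= free[part] - 1
            free[part] = 0
            overhead = 1
    return free


def ffd_f_iterative(sizes, U):
    """Steps 1 to 4: start at ceil(s(L)/U) bins and add bins until FFDf succeeds."""
    if any(not 1 <= s <= U for s in sizes):
        raise ValueError("item sizes must lie in 1..U")
    m = -(-sum(sizes) // U)        # ceiling of s(L)/U
    while True:
        free = ffd_f(sizes, U, m)
        if free is not None:
            return m, free
        m += 1
```

With U = 10 and three items of size 6, the third item is split across the two bins and both end up full; with U = 4 and sizes [3, 3, 2], two bins fail and the loop settles on three.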

It is interesting to see if FFDf-I improves on the performance ratio of NFf.
As we shall see, the improvement is significant for small values of bin size.

Theorem 2. The asymptotic worst case performance ratio of the FFDf-I
algorithm satisfies:

(i) $R^\infty_{FFD_f-I} = \frac{U}{U-1}$ when $U \le 15$

(ii) $\frac{U}{U-1} < R^\infty_{FFD_f-I} < \frac{U}{U-2}$ when $U \ge 16$

Proof. To prove the theorem we use a property which we call the border bin
property. We assume FFDf-I has m bins to pack and number the bins
$B_1, \ldots, B_m$ according to the order they are opened by the algorithm.
Looking at the level (used space) of the bins at a certain time during algorithm
execution, we say that bin $B_j$, $j < m$, is a border bin at that time if its
level satisfies $L(B_j) \ne L(B_{j+1})$.

Claim. At any time before the first item is fragmented, the packing of FFDf-I
contains at most 2U border bins.

Proof. The FFDf-I algorithm packs the items in decreasing size order. We
consider the number of border bins before the first item is fragmented. Denote
by BR(k) the number of border bins after k different sizes of items have been
packed by FFDf-I. Clearly $BR(1) \le 2$ since items of the same size are packed
in a similar way. Whenever each additional size is packed the number of border
bins may increase by at most two, therefore $BR(k) \le BR(k-1) + 2$. Since there
are at most U different sizes, $BR(U) \le 2U$, and the claim follows. □

We now proceed to prove the theorem for each range separately. 

(i) $U \le 15$: We establish $\frac{U}{U-1}$ as an upper bound on the asymptotic
performance ratio. It is clear that as long as the number of bins with two units
of overhead is small, i.e., O(1), the asymptotic performance ratio cannot exceed
$\frac{U}{U-1}$ (the proof is similar to that of Lemma 1). We make the following
observation:

Claim. The final packing of the FFDf-I algorithm cannot contain more than
O(1) bins with 2 units of overhead, if one of the following conditions is met:

1. Before the first fragmentation occurs, the free space in the bins is 2 or less. 

2. The fragmented items are of size 4 or less. 

Proof. Omitted. 

It is easy to verify that the conditions set by the above claim cannot be 
extended, since fragmenting items of size 5 over bins of size 3 results in one 
third of the bins containing 2 units of overhead. Therefore, in order to get a 
significant number of bins with 2 units of overhead, items of size 5 or more 
must be fragmented. This means that the list should contain items of size 6 
or more. Note that if FFDf — I fragments items of sizes s(oi) > they are 
also fragmented by the optimal packing. We conclude that in order to create 
a difference of more than one overhead unit, between the optimal packing and 
the packing of FFDf-I, two items of size $s(a_i) \ge 6$ must be packed in a bin
and leave a free space of at least 3. To do so, the bin size must satisfy
$U \ge 15$. However, in the case of U = 15, items of size 5 are packed without
fragmentation and therefore the value $\frac{U}{U-1}$ is an upper bound on the
asymptotic performance ratio for $U \le 15$.



(ii) $U \ge 16$: We first prove the lower bound and then the upper bound.

Claim. For the FFDf-I algorithm, $R^{\infty}_{FFDf-I} > \frac{U}{U-1}$.

Proof. For each value $U \ge 16$, there is a list for which the performance ratio
exceeds $\frac{U}{U-1}$. We present here an example only for the case where U is
even; an example for odd values of U is similar. Assume U = 2u and choose a list
L with k items of size u − 2, then k items of size u − 3 and finally k items of
size 5. The optimal packing fills k bins each with one item of each size,
OPT(L) = k. When FFDf-I is applied to L the first k bins contain all items except
for the last k/4 items of size 5. In the best case (U is a multiple of 5) no more
overhead is produced and $\frac{5k}{4U}$ more bins are used. The total number of
bins required by the algorithm is $FFDf\text{-}I(L) \ge k + \frac{5k}{4U}$. The
performance ratio in this case is:

$$\frac{FFDf\text{-}I(L)}{OPT(L)} \ge 1 + \frac{5}{4U} > \frac{U}{U-1},$$

where the last inequality holds because $5(U-1) > 4U$ whenever $U > 5$. □



We now turn to the upper bound on the performance ratio. 

Claim. For the FFDf-I algorithm, $R^{\infty}_{FFDf-I} < \frac{U}{U-2}$.

Proof. In order to prove the claim we show that the algorithm cannot produce 
a packing where each bin (maybe except for a negligible number) contains two 
units of overhead. The border bins property implies that just before the first item 
is fragmented the bins are arranged in long sequences where the free space in all 
bins is equal. The items are also packed in long sequences. Let us assume the free 
space in a sequence is x and the size of the items packed is y. Obviously x < y 
otherwise the items are not fragmented. Note that when an item is fragmented 
over more than two bins, only the first and last bins can contain two units of 
overhead. Therefore it is enough to consider only the case where an item is 
fragmented over two bins. Assume that item number k is fragmented over two
bins such that bin number 1 contains a fragment of size a and bin number 2
contains a fragment of size y − a. The next item (number k + 1) is also fragmented
and a fragment of size x − y + a − 2 is packed in bin number 2. In order to create
a repeated cycle of fragmentations the size of the first fragment of each item 
must be equal, that is a = x — y + a — 2. This is true for y = x — 2, but for that 
value the items are not fragmented in the first place. Since we can not create a 
repeated cycle of equal size fragmentations, the size of the first fragment packed 
in a bin increases, until it reaches the size of the bin, in which case only one unit 
of overhead is packed in the bin. This means that at least one of every y bins 
contains only one unit of overhead. Since $y < U/2$ we can establish that:

$$R^{\infty}_{FFDf-I} < \frac{U}{U-2} \qquad (6)$$

This concludes the proof of Theorem 2. □

The improvement of FFDf-I over NFf is significant for low values of bin
size ($U \le 15$), which are the most meaningful values (since the performance
ratio is decreasing with the bin size). Moreover, FFDf-I is superior for any
value of U. We make the following remarks:

— For each of the following values: $U \in \{7, 9, 10, 11, 13, 14, 15\}$, it is
possible to find an example where the ratio is $\frac{U}{U-1}$, hence
$R^{\infty}_{FFDf-I} = \frac{U}{U-1}$.

— When $U \le 5$, $R^{\infty}_{FFDf-I} = 1$; however, when $U \in \{6, 8\}$,
$R^{\infty}_{FFDf-I} > 1$. This is interesting, since for the classical problem
the performance ratio of FFD is $R^{\infty}_{FFD} = 1$.

— We may define algorithm BFDf-I in a similar way to FFDf-I, only this
algorithm is based on the Best-Fit Decreasing heuristic. We can show that
the BFDf-I algorithm has the same performance ratio as FFDf-I.

3 Bin Packing with Size-Preserving Fragmentation 

In this section we study a different fragmentation cost function in which frag- 
menting an item does not increase its size. Instead, we assume that packing an 
item is associated with a cost and fragmentation increases this cost. The cost 
of packing an item (or fragmenting it) depends on the application. As an ex- 
ample, take the scheduling problem but assume that fragmenting a datagram 
does not increase its size, because the format of the datagram already includes 
the fragmentation fields. On the other hand fragmentation requires additional 
resources from the system (CPU, memory) and takes longer to process. In other 
applications, such as in stock-cutting problems, it may simply cost to fragment 
an item (cut a piece of pipe for example) or put it back together. We proceed to 
formally define the problem. 

Bin Packing with Size-Preserving Fragmentation (BP-SPF): We are
given a list of n items $L = (a_1, a_2, \ldots, a_n)$, each with a size
$s(a_i) \in \{1, 2, \ldots, U\}$ and a cost $c(a_i)$. The items must be packed into
m identical bins of size U. It is possible to fragment any item, in which case one
unit is added to its cost but its size does not change. The goal is to minimize
the total cost.

Denote by s(L) and c(L) the total size and cost of all items, respectively. To
ensure all items can be packed, we assume $s(L) \le mU$.

Performance: There are several ways to evaluate the performance of an algo- 
rithm for the problem. We observe that since the cost of fragmentation is not 
related to the size or cost of an item, the additional cost of an algorithm depends 
only on the number of fragmentations it performs. We therefore choose to evaluate
the performance of an algorithm by its overhead. For a given list L and algorithm
A, let c(A, L) be the total cost of algorithm A, let c(OPT, L) denote the optimal
(minimal) cost and define the overhead of A as $OH_A(L) = c(A, L) - c(OPT, L)$.
For the case of OPT(L) = m, we define the worst case overhead of algorithm A
as:

$$OH^m_A = \inf\{h : OH_A(L) \le h \text{ for all } L \text{ with } OPT(L) = m\}. \qquad (7)$$

We first show that the complexity of BP-SPF is similar to that of BP-SIF.

Claim. BP-SPF is NP-hard in the strong sense.

Proof. The proof is similar to that of BP-SIF and is omitted from this version. 

We consider only algorithms that prevent unnecessary fragmentation (see 
Definition 1). Such algorithms have the following property. 

Lemma 2. For any algorithm A that prevents unnecessary fragmentations,
$OH^m_A \le m - 1$.

Proof. Since A prevents unnecessary fragmentations, it may perform at most
m − 1 fragmentations when packing m bins. The maximum cost of algorithm A
is therefore $c(A, L) = c(L) + (m - 1)$. Clearly $c(OPT, L) \ge c(L)$, which means
that for any list L: $OH_A(L) \le m - 1$. □

We now examine the performance of the NFf and FFDf algorithms (defined
in subsections 2.1 and 2.2 respectively). We show that the performance of the
NFf algorithm is the worst possible while FFDf performs better.

Theorem 3. The overhead of algorithm NFf for every $m \ge 2$ is
$OH^m_{NFf} = m - 1$, $\forall U \ge 2$.

Proof. Lemma 2 provides an upper bound. As a worst case example choose a
list of items with one item of size U − 1 followed by m − 1 items of size U. □
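The worst case example in this proof is easy to check mechanically. The sketch below assumes that NFf is Next-Fit with fragmentation as defined earlier in the paper (outside this section): an item that does not fit in the open bin is split, its first fragment fills the open bin, and the rest goes into a newly opened bin. Sizes are preserved, so only the number of fragmentations is counted. The function name `nff_fragmentations` is illustrative.

```python
def nff_fragmentations(items, U):
    """Pack items in list order with Next-Fit plus fragmentation.

    Returns (bins_used, fragmentations). Sizes are preserved when an item is
    split, so the overhead equals the number of fragmentations.
    """
    free = U
    bins_used = 1
    fragmentations = 0
    for size in items:
        while size > free:
            if free > 0:
                # First fragment fills the open bin; the item is cut once.
                size -= free
                fragmentations += 1
            # Close the open bin and open the next one.
            bins_used += 1
            free = U
        free -= size
    return bins_used, fragmentations
```

With U = 10 and m = 5, the list `[9, 10, 10, 10, 10]` uses five bins and every item after the first is cut once, giving m − 1 = 4 fragmentations, while an optimal packing of the same list needs none.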

We now turn to the FFDf algorithm. We expect FFDf to perform better 
than NFf and this is indeed true when the bin size U is small. However, we 
show that if the bin size is not bounded, the worst case overhead of FFDf is
the maximum possible, that is, there exists a list L for which, for any value of m,
$c(FFDf, L) - c(OPT, L) = m - 1$.

Claim. For the FFDf algorithm, for every $m \ge 2$, $OH^m_{FFDf} = m - 1$.

Proof. We choose a bin size satisfying $U > 2m + 16$. The list L is made of k
repetitions of the following set:
$L' = \{U/2 + 2,\ U/2 + 1,\ U/4 + 2,\ U/4 + 1,\ U/4 - 3,\ U/4 - 3\}$.
Two bins of size U are needed to pack $L'$, therefore m = 2k bins are
needed to pack L. The optimal packing causes no fragmentations. The FFDf
algorithm packs the items in the following way: The first 5k items are packed
without fragmentations. The free space in the first k bins is $U/4 - 4$; in the
remaining k bins the free space is 1. Since the free space in all bins is smaller
than the size of the items, the items are fragmented. Each item except the last is
fragmented over two bins. When packing the last item, bins $B_1, \ldots, B_{k-1}$
are full, bin $B_k$ has free space of $U/4 - k - 3$ and each bin
$B_{k+1}, \ldots, B_{2k}$ has free space of 1. As a result, the last item is
fragmented over the remaining free space, causing k more fragmentations. The total
number of fragmentations is m − 1 and the overhead is therefore
$OH^m_{FFDf} = m - 1$. □
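The construction in this proof can be replayed on small instances. The following sketch assumes the size-preserving model of this section (a fragment occupies exactly its own size) and takes U to be a multiple of 4 so that all item sizes are integers; the function names are illustrative and the code is a check of the example, not the authors' implementation.

```python
def ffdf_fragmentations(items, U, m):
    """FFDf with size-preserving fragmentation on m bins of size U.

    Returns the number of fragmentations, or None if the items do not fit.
    """
    free = [U] * m
    cuts = 0
    for size in sorted(items, reverse=True):
        remaining = size
        while remaining > 0:
            # Lowest indexed bin into which the (rest of the) item fits.
            j = next((i for i in range(m) if free[i] >= remaining), None)
            if j is not None:
                free[j] -= remaining
                break
            # Otherwise fill the lowest indexed bin that is not yet full.
            j = next((i for i in range(m) if free[i] > 0), None)
            if j is None:
                return None
            remaining -= free[j]
            free[j] = 0
            cuts += 1
    return cuts


def worst_case_list(U, k):
    """k repetitions of L' = {U/2+2, U/2+1, U/4+2, U/4+1, U/4-3, U/4-3}."""
    block = [U // 2 + 2, U // 2 + 1, U // 4 + 2, U // 4 + 1,
             U // 4 - 3, U // 4 - 3]
    return block * k
```

For k = 2 (so m = 4) and U = 28, which satisfies U > 2m + 16, the call `ffdf_fragmentations(worst_case_list(28, 2), 28, 4)` reports 3 = m − 1 fragmentations, matching the count in the proof.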

To hold for every value of m the above claim requires an unbounded bin
size. Since we are mainly interested in asymptotic behavior, i.e., $U \ll m$, this
is clearly not a practical assumption. For the more reasonable cases where
$m \gg U$ the overhead of algorithm FFDf is less than m − 1.

Claim. For any bin size U there is $N > 0$ for which the overhead of FFDf
satisfies $OH^m_{FFDf} < m - 1$, $\forall m > N$.

Proof. We show that if $U \ll m$ then $c(FFDf, L) < c(L) + (m - 1)$. Recall that
when analyzing the FFDf-I algorithm we proved the border bin property. The
property tells us that before the first item is fragmented the bins are ordered in
long sequences of equal content bins. Note that by increasing m while keeping U
constant, we can create sequences of any length. Consider the moment before FFDf
fragments the first item. Let us assume the free space in a sequence is x and the
size of the items packed is y, where x < y. Clearly at least one out of every
$x \cdot y$ bins is closed without fragmentation. The overhead is therefore always
smaller than m − 1. □
Abstract. We describe fully polynomial time approximation schemes
for generalized multicommodity flow problems arising in VLSI applications
such as Global Routing via Buffer Blocks (GRBB). We extend
Fleischer's improvement [2] of Garg and Könemann's fully polynomial
time approximation scheme for edge capacitated multicommodity flows
to multiterminal multicommodity flows in graphs with capacities on
vertices and subsets of vertices. In addition, our problem formulations
observe upper bounds and parity constraints on the number of vertices on
any source-to-sink path. Unlike previous works on the GRBB problem,
our algorithms can take into account (i) multiterminal nets, (ii)
simultaneous buffered routing and compaction, and (iii) buffer libraries.
Our method outperforms existing algorithms for the problem and has
been validated on top-level layouts extracted from a recent high-end
microprocessor design.



1 Introduction 

In this paper, we address the problem of how to perform buffering of global 
nets given an existing buffer block plan. We give integer linear program (ILP) 
formulations of the basic Global Routing via Buffer Blocks (GRBB) problem and 
its extensions to (i) multiterminal nets, (ii) simultaneous buffered routing and 
compaction, and (iii) buffer libraries. The fractional relaxations of these ILP’s 
are separable packing LP’s (SP LP) which are multiterminal multicommodity 
flows in graphs with capacities on vertices and subsets of vertices. 
The main contribution of this paper is a practical algorithm for the GRBB
problem and its extensions based on a fully polynomial time approximation
scheme (FPTAS) for solving SP LPs. Prior to our work, heuristics based on
solving fractional relaxations followed by randomized rounding have been applied
to VLSI global routing. As noted in earlier work, the applicability of this
approach is limited to problem instances of relatively small size by the
prohibitive cost of solving exactly the fractional relaxation. We avoid this
limitation by giving an FPTAS for SP LPs based on the results of Garg and
Könemann and of Fleischer [2]. Computational experience with industrial
benchmarks shows that our approach is practical and outperforms existing
algorithms.

The rest of the paper is organized as follows. In Section 3 we formulate the
GRBB problem and its extensions as integer linear programs. The fractional
relaxation of these ILPs is a special type of packing LP which we refer to as
separable packing LP. In Section 4 we give a practical approximation algorithm,
obtained by extending the ideas of Fleischer [2], for separable packing LPs; the
details of the key subroutine for finding minimum-weight feasible Steiner trees
are given in Section 5, and the details of randomized rounding algorithms are in
Section 6. In Section 7 we describe implementations of several GRBB heuristics and
give the results of an experimental comparison of these heuristics on industrial
test cases.

2 Global Buffering via Buffer Blocks 

Process scaling in VLSI leads to an increasingly dominant effect of interconnect
on high-end chip performance. Each top-level global net must undergo repeater
or buffer (inverter) insertion to maintain signal integrity and reasonable signal
delay. It is estimated that up to $10^6$ repeaters will be needed for the next
generation on-chip interconnect. To isolate repeaters from circuit block
implementations, a buffer block methodology is becoming increasingly popular. Two
recent works, by Cong, Kong and Pan and by Tang and Wong, give algorithms
to solve the buffer block planning problem. Their buffer block planning
formulation is roughly stated as follows: Given a placement of circuit blocks, 
and a set of 2-pin connections with feasible regions for buffer insertion, plan the 
location of buffer blocks within the available free space so as to route a maximum 
number of connections. 

In this paper we address the problem of maximizing the number of routed 
nets for given buffer block locations and capacities, informally defined as follows. 

Given: 

— a planar region with rectangular obstacles; 

— a set of nets in the region, each net having: 

• a non-negative importance (criticality) coefficient; 

• a single source and multiple sinks; 
— for each sink:

• a parity requirement and an upper-bound on the number of buffers on 
the path connecting it to the source; 

— a set of buffer blocks, each with given capacity; and 

— an interval [L, U] specifying lower and upper bounds on the distance between
buffers.

Global Routing via Buffer Blocks (GRBB) Problem: route a subset of 
the given nets, with maximum total importance, such that: 

— the distance between the source of a route and its first repeater, between 
any two consecutive repeaters, respectively between the last repeater on a 
route and the route’s sink, are all between L and U ; 

— the number of routing trees passing through any given buffer block does not 
exceed the block’s capacity; 

— the number of buffers on each source-sink path does not exceed the given 
upper bound and has the required parity; to meet the parity constraint two 
buffers of the same block can be used. 

We also address the following extensions of the basic GRBB problem: 

— GRBB with Set Capacity Constraints. The basic GRBB problem assumes
predetermined capacities for all buffer blocks. In practice buffer blocks
are placed in the space available after placing circuit blocks, and some of the
circuit blocks can still be moved within certain limits (Figure 1). The GRBB
problem with set capacity constraints captures this freedom by allowing
constraints on the total capacity of arbitrary sets of buffer blocks.




Fig. 1. Two buffer blocks BBl and BB2 that share capacity: if the circuit block 
M moves to the right, then the capacity of buffer block BBl is increasing while 
the capacity of buffer block BB2 is decreasing. In this example it is the sum 
of capacities of BBl and BB2, rather than their individual capacities, that is 
bounded. 
— GRBB with Buffer Library. To achieve better use of area and power
resources, multiple buffer types can be used. The GRBB problem with buffer
library optimally distributes the available buffer block capacity between given
buffer types and simultaneously finds optimum buffered routings.

3 Integer Linear Program Formulations 

Throughout this paper we let $N_k = (s_k; t_k^1, \ldots, t_k^{q_k})$,
$k = 1, \ldots, K$, denote the nets to be routed; $s_k$ is the source, and
$t_k^1, \ldots, t_k^{q_k}$ are the sinks of net $N_k$. We denote by $g_k \ge 1$
the importance (criticality) coefficient of net $N_k$, and by
$a_k^i \in \{\text{even}, \text{odd}\}$ and $l_k^i \ge 0$ the prescribed parity,
respectively upper bound, on the number of buffers on the path between source
$s_k$ and sink $t_k^i$. We also let $S = \{s_1, \ldots, s_K\}$ and
$S' = \{t_1^1, \ldots, t_1^{q_1}, \ldots, t_K^1, \ldots, t_K^{q_K}\}$ denote the
set of sources, respectively of sinks, and $R = \{r_1, \ldots, r_n\}$ denote the
given set of buffer blocks. For each buffer block $r_i$, we let $c(r_i)$ denote
its capacity, i.e., the maximum number of buffers that can be inserted in $r_i$.

A routing graph for nets $N_k$, $k = 1, \ldots, K$, is an undirected graph
$G = (V, E)$ such that $S \cup S' \subseteq V$. The set of vertices of G other
than sources and sinks, $V \setminus (S \cup S')$, is denoted by $V'$. All
vertices in a routing graph are associated to locations on the chip, including
vertices of $V'$, which are associated with buffer block locations. We require
that the rectilinear distance with obstacles between two vertices connected by an
edge in the routing graph be either between L and U or 0 (this last case
corresponds to using two buffers in the same buffer block). Thus, inserting a
buffer at each Steiner point ensures that every Steiner tree in the routing graph
satisfies the given L/U bounds. A feasible Steiner tree for net $N_k$ is a
Steiner tree $T_k$ connecting terminals $s_k, t_k^1, \ldots, t_k^{q_k}$ such
that, for every $i = 1, \ldots, q_k$, the path of $T_k$ connecting $s_k$ to
$t_k^i$ has length at most $l_k^i$ and parity $a_k^i$. We denote the set of all
feasible Steiner trees for net $N_k$ by $\mathcal{T}_k$, and let
$\mathcal{T} = \bigcup_{k=1}^{K} \mathcal{T}_k$.

For the GRBB problem, the routing graph $G = (V, E)$ has
$V = S \cup S' \cup \{r', r'' \mid r \in R\}$ (there are two vertices
corresponding to each buffer block to allow for feasible Steiner trees that meet
the parity constraints by using two buffers in the same buffer block) and
$E = \{(r', r'') \mid r \in R\} \cup \{(x, y) \mid x, y \in V,\ L \le d(x, y) \le U\}$,
where $d(x, y)$ is the rectilinear distance with obstacles between points x and y.
Given importance coefficients $g_k = g(N_k)$ for each net $N_k$, let
$g(T) = g_k$ for each tree $T \in \mathcal{T}_k$, $k = 1, \ldots, K$. The GRBB
problem is then equivalent to the following integer linear program:

maximize $\sum_{T \in \mathcal{T}} g(T) f_T$    (GRBB ILP)

subject to

    $\sum_{T \in \mathcal{T}} \pi_T(v) f_T \le 1$,  $\forall v \in S \cup S'$

    $\sum_{T \in \mathcal{T}} (\pi_T(r') + \pi_T(r'')) f_T \le c(r)$,  $\forall r \in R$

    $f_T \in \{0, 1\}$,  $\forall T \in \mathcal{T}$

where $\pi_T(v)$ is 1 if $v \in T$ and 0 otherwise.
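The construction of the GRBB routing graph can be illustrated with a short Python sketch. It makes a simplifying assumption that is not in the paper: obstacles are ignored, so d(x, y) is the plain rectilinear (Manhattan) distance. The function name `build_routing_graph` and the convention of naming the two copies of block r as `r'` and `r''` are illustrative.

```python
from itertools import combinations


def build_routing_graph(sources, sinks, blocks, L, U):
    """Build (locations, edges) for the GRBB routing graph.

    sources, sinks, blocks: dicts mapping a name to an (x, y) location.
    Assumption: no obstacles, so d(x, y) is the Manhattan distance.
    """
    loc = {}
    loc.update(sources)
    loc.update(sinks)
    edges = set()
    for r, p in blocks.items():
        # Two vertices per buffer block, joined by a zero-length edge.
        loc[r + "'"] = p
        loc[r + "''"] = p
        edges.add((r + "'", r + "''"))
    for a, b in combinations(sorted(loc), 2):
        d = abs(loc[a][0] - loc[b][0]) + abs(loc[a][1] - loc[b][1])
        if L <= d <= U:
            edges.add((a, b))
    return loc, edges
```

For a source at (0, 0), a sink at (10, 0), one buffer block at (5, 0) and [L, U] = [4, 6], the graph has five edges: the internal edge between the two copies of the block, and one edge from each copy to the source and to the sink. The source and sink are not directly connected because their distance exceeds U.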




Practical Approximation Algorithms for Separable Packing Linear Programs 






The GRBB ILP, as well as the ILP formulations for GRBB with set capacity
constraints and buffer library (which we omit from this extended abstract), are
captured by the following common generalization, referred to as the separable
packing ILP (SP ILP):

maximize $\sum_{T \in \mathcal{T}} g(T) f_T$    (SP ILP)

subject to

    $\sum_{T \in \mathcal{T}} \left( \sum_{v \in X} \pi_T(v) s(v) \right) f_T \le c(X)$,  $\forall X \in \mathcal{V}$

    $f_T \in \{0, 1\}$,  $\forall T \in \mathcal{T}$

for given

— arbitrary sets $\mathcal{T}_k$ of Steiner trees for each net $N_k$;

— family $\mathcal{V}$ of subsets of V such that $\{v\} \in \mathcal{V}$ for every $v \in S \cup S'$;

— “size” function $s : V \to \mathbb{R}_+$ such that $s(v) = 1$ for every $v \in S \cup S'$; and

— “set-capacity” function $c : \mathcal{V} \to \mathbb{R}_+$ such that $c(\{v\}) = 1$ for every $v \in S \cup S'$.

Our two-step approach to the GRBB problem and its extensions is to first
solve the fractional relaxations obtained by replacing integrality constraints
$f_T \in \{0, 1\}$ with $f_T \ge 0$, and then use randomized rounding to get
integer solutions. In the next section we give an algorithm for approximating the
fractional relaxation of the SP ILP. The algorithm relies on a subroutine for
finding minimum weight feasible Steiner trees; the details of this subroutine are
given in Section 5.

4 Approximating the SP ILP Relaxation 

The fractional relaxation of the SP ILP can be solved exactly in polynomial time
using, e.g., the ellipsoid algorithm. However, exact algorithms are highly
impractical. The SP LP can be efficiently approximated within any desired accuracy
using Garg and Könemann's approximation scheme for packing LPs. The
main step of their algorithm is computing the minimum weight column of the
LP. For the special case of edge-capacitated multicommodity flow LPs, Fleischer
[2] gave a significantly faster algorithm by computing in each step the minimum
weight column only among columns corresponding to a single commodity. Below
we generalize Fleischer's idea to separable packing LPs by partitioning the
columns into groups corresponding to the nets.

4.1 The Algorithm 

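Our algorithm simultaneously finds feasible solutions to the SP LP and its
dual. The dual LP asks for an assignment of non-negative weights $w(X)$ to
every $X \in \mathcal{V}$ such that the weight of every tree $T \in \mathcal{T}$
is at least 1, where the weight of T is defined by
$\mathrm{weight}(T) = \frac{1}{g(T)} \sum_{X \in \mathcal{V}} \pi_T(X) w(X)$ and
$\pi_T(X) = \sum_{v \in X} \pi_T(v) s(v)$:

minimize $\sum_{X \in \mathcal{V}} w(X) c(X)$    (SP LP Dual)

subject to

    $\frac{1}{g(T)} \sum_{X \in \mathcal{V}} \pi_T(X) w(X) \ge 1$,  $\forall T \in \mathcal{T}$

    $w(X) \ge 0$,  $\forall X \in \mathcal{V}$

Input: Nets $N_1, \ldots, N_K$, coefficients $g_1, \ldots, g_K$, routing graph $G = (V, E)$,
       family $\mathcal{V}$ of subsets of V, capacities $c(X)$, $X \in \mathcal{V}$, and weights $s(v)$, $v \in V$
Output: SP LP solution $f_T$, $T \in \mathcal{T}$

For every $T \in \mathcal{T}$, $f_T \leftarrow 0$
For every $X \in \mathcal{V}$, $w(X) \leftarrow \delta$
$\alpha \leftarrow \delta / \Gamma$
For i = 1 to $t = \lceil \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta} \rceil$ do
    For k = 1 to K do
        Find a minimum weight feasible Steiner tree T in $\mathcal{T}_k$
        While $\mathrm{weight}(T) < \min\{1, (1+\varepsilon)\alpha\}$ do
            $f_T \leftarrow f_T + 1$
            For all $X \in \mathcal{V}$, $w(X) \leftarrow w(X)(1 + \varepsilon \pi_T(X)/c(X))$
            Find a minimum weight feasible Steiner tree T in $\mathcal{T}_k$
        End while
    End for on k
    $\alpha \leftarrow (1+\varepsilon)\alpha$
End for on i
For every $T \in \mathcal{T}$, $f_T \leftarrow f_T / \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}$
Output $f_T$, $T \in \mathcal{T}$

Fig. 2. The algorithm for finding approximate solutions to the SP LP.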

In the following we assume that $\min\{g_k : k = 1, \ldots, K\} = 1$ (this can be
easily achieved by scaling) and denote $\max\{g_k : k = 1, \ldots, K\}$ by $\Gamma$.

The algorithm (Figure 2) starts with weights $w(X) = \delta$ for every
$X \in \mathcal{V}$, where $\delta$ is an appropriately chosen constant, and with
an SP LP solution f = 0. While there is a feasible tree whose weight is less than
1, the algorithm selects such a tree T and increments $f_T$ by 1. This increase
will likely violate the capacity constraints for some of the sets in
$\mathcal{V}$; feasibility is achieved at the end of the algorithm by uniformly
scaling down all $f_T$'s. Whenever $f_T$ is incremented, the algorithm also
updates each weight $w(X)$ by multiplying it with
$(1 + \varepsilon \pi_T(X)/c(X))$, for a fixed $\varepsilon$.

According to Garg and Könemann's approximation algorithm, each
iteration must increment the variable $f_T$ corresponding to a tree with minimum
weight among all trees in $\mathcal{T}$. Finding this tree essentially requires K
minimum-weight feasible Steiner tree computations, one for each net $N_k$. We
reduce the total number of minimum-weight feasible Steiner tree computations
during the algorithm by extending a speed-up idea due to Fleischer [2]. Instead of
always finding the minimum-weight tree in $\mathcal{T}$, the idea is to settle for
trees with weight within a factor of $(1 + \varepsilon)$ of the minimum. As shown
in the next section, the faster algorithm still leads to an approximation
guarantee similar to that of Garg and Könemann.
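The loop structure of the algorithm in Figure 2 can be sketched in Python on a toy instance. The sketch rests on assumptions that are not part of the paper: the candidate trees of each net are listed explicitly as vertex sets, so the minimum weight feasible Steiner tree step is a brute-force minimum over that list; sizes default to 1; and the input is assumed to be scaled so that the smallest coefficient $g_k$ equals 1. The value of δ is the one used in the analysis of the next subsection, and the name `approx_sp_lp` is illustrative.

```python
import math


def approx_sp_lp(nets, sets, eps, size=None):
    """Approximate the SP LP on an explicitly listed instance.

    nets: list of (g_k, trees), each tree a frozenset of vertices.
    sets: list of (X, c(X)) with X a frozenset of vertices.
    Returns a dict mapping (net index, tree) to the scaled value f_T.
    """
    size = size or {}

    def pi(tree, X):
        return sum(size.get(v, 1) for v in tree & X)

    gamma = max(g for g, _ in nets)
    L = max(len(T) for _, trees in nets for T in trees)
    delta = (1 + eps) * gamma * ((1 + eps) * L * gamma) ** (-1 / eps)
    scale = math.log((1 + eps) * gamma / delta, 1 + eps)
    w = [delta] * len(sets)
    f = {}

    def weight(g, T):
        return sum(pi(T, X) * w[j] for j, (X, _) in enumerate(sets)) / g

    alpha = delta / gamma
    for _ in range(math.ceil(scale)):
        for k, (g, trees) in enumerate(nets):
            T = min(trees, key=lambda tr: weight(g, tr))
            while weight(g, T) < min(1.0, (1 + eps) * alpha):
                f[(k, T)] = f.get((k, T), 0) + 1
                for j, (X, c) in enumerate(sets):
                    w[j] *= 1 + eps * pi(T, X) / c
                T = min(trees, key=lambda tr: weight(g, tr))
        alpha *= 1 + eps
    # Uniform scaling at the end restores feasibility.
    return {key: val / scale for key, val in f.items()}
```

On an instance with two nets that must both pass through a single buffer vertex of capacity 1, the LP optimum is 1; with ε = 0.1 the sketch returns a feasible solution whose total value is well within the 1/(1 + 4ε) guarantee proved below.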



4.2 Runtime and Performance Analysis 



In each iteration the algorithm cycles through all nets. For each net, the
algorithm repeatedly computes a minimum-weight feasible Steiner tree until the
weight becomes larger than $(1 + \varepsilon)$ times a lower bound $\alpha$ on the
overall minimum weight, $\min\{\mathrm{weight}(T) : T \in \mathcal{T}\}$. The
lower bound $\alpha$ is initially set to $\delta/\Gamma$, and then multiplied by a
factor of $(1 + \varepsilon)$ from one iteration to another (note that no tree in
$\mathcal{T}$ has weight smaller than $(1 + \varepsilon)\alpha$ at the end of an
iteration, so $(1 + \varepsilon)\alpha$ is a valid lower bound for the next
iteration).

The scheme used for updating $\alpha$ fully determines the number of iterations in
the outer loop of the algorithm. Since $\alpha = \delta/\Gamma$ in the first
iteration and at most $(1 + \varepsilon)$ in the last one, it follows that the
number of iterations is at most
$\lceil \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta} \rceil$.
The following lemma gives an upper bound on the runtime of the algorithm.



Lemma 1. Overall, the algorithm in Figure 2 requires
$O\left(K \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}\right)$
minimum-weight feasible Steiner tree computations.

Proof. First, note that the number of minimum-weight feasible Steiner tree
computations that do not contribute to the final fractional solution is
$K \lceil \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta} \rceil$.
Indeed, in each iteration, and for each net $N_k$, there is exactly one
minimum-weight feasible Steiner tree computation revealing that
$\min_{T \in \mathcal{T}_k} \mathrm{weight}(T) \ge (1 + \varepsilon)\alpha$; all
other computations trigger the incrementation of some $f_T$.

We claim that the number of minimum-weight Steiner trees that lead to
variable incrementations is at most
$K \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}$. To see this, note
that the weight of the set $\{s_k\} \in \mathcal{V}$ is updated whenever a
variable $f_T$, $T \in \mathcal{T}_k$, is incremented. Moreover, $w(\{s_k\})$ is
last updated when incrementing $f_T$ for a tree $T \in \mathcal{T}_k$ of weight
less than one. Thus, before the last update,
$w(\{s_k\}) \le \Gamma \cdot \mathrm{weight}(T) < \Gamma$. Since
$\pi_T(\{s_k\}) = c(\{s_k\}) = 1$, the weight of $\{s_k\}$ is multiplied by a
factor of $1 + \varepsilon$ in each update, including the last one. This implies
that the final value of $w(\{s_k\})$ is at most $(1 + \varepsilon)\Gamma$.
Recalling that $w(\{s_k\})$ is initially set to $\delta$, this gives that the
number of updates to $w(\{s_k\})$ is at most
$\log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}$. The lemma follows by
summing this upper bound over all nets. □



We now show that, for an appropriate value of the parameter $\delta$, the
algorithm finds a feasible solution close to optimum.

Theorem 1. For every $\varepsilon < 0.15$, the algorithm in Figure 2 computes a
feasible solution to the SP LP within a factor of $1/(1 + 4\varepsilon)$ of
optimum by choosing
$\delta = (1 + \varepsilon)\Gamma \left((1 + \varepsilon) L \Gamma\right)^{-1/\varepsilon}$;
the runtime of the algorithm for this value of $\delta$ is
$O\left(\frac{1}{\varepsilon^2} K (\log L + \log \Gamma)\, T_{tree}\right)$. Here,
L is the maximum number of vertices in a feasible tree, and $T_{tree}$ is the time
required to compute the minimum weight feasible Steiner tree for a net.

Proof. Our proof is an adaptation of the proofs of Garg and Könemann
and Fleischer [2]. We omit the proof that the solution found by the algorithm
is feasible. To establish the approximation guarantee, we show that the solution
computed by the algorithm is within a factor of $1/(1 + 4\varepsilon)$ of the
optimum objective value, $\beta$, of the dual LP. Let $\alpha(w)$ be the weight of
a minimum weight tree from $\mathcal{T}$ with respect to weight function
$w : \mathcal{V} \to \mathbb{R}_+$, and let
$D(w) = \sum_{X \in \mathcal{V}} w(X) c(X)$. A standard scaling argument shows
that the dual LP is equivalent to finding a weight function w such that
$D(w)/\alpha(w)$ is minimum, and that $\beta = \min_w \{D(w)/\alpha(w)\}$.

For every $X \in \mathcal{V}$, let $w_i(X)$ be the weight of set X at the end of
the i-th iteration and $w_0(X) = \delta$ be the initial weight of set X. For
brevity, we will denote $\alpha(w_i)$ and $D(w_i)$ by $\alpha(i)$ and $D(i)$,
respectively. Furthermore, let $f_T^i$ be the value of $f_T$ at the end of the
i-th iteration, and $h_i = \sum_{T \in \mathcal{T}} g(T) f_T^i$ be the objective
value of the SP LP at the end of this iteration.

When the algorithm increments $f_T$ by one unit, each weight $w(X)$ is increased
by $\varepsilon \pi_T(X) w(X)/c(X)$. Thus, the incrementation of $f_T$ increases
$D(w)$ by

$$\varepsilon \sum_{X \in \mathcal{V}} \pi_T(X) w(X) = \varepsilon\, \mathrm{weight}(T)\, g(T).$$

If this update takes place in the i-th iteration, then
$\mathrm{weight}(T) \le (1 + \varepsilon)\alpha(i-1)$. Adding this over all
$f_T$'s incremented in the i-th iteration gives

$$D(i) - D(i-1) \le \varepsilon(1 + \varepsilon)\,\alpha(i-1)(h_i - h_{i-1}),$$

which implies that

$$D(i) - D(0) \le \varepsilon(1 + \varepsilon) \sum_{j=1}^{i} \alpha(j-1)(h_j - h_{j-1}).$$

Consider the weight function $w_i - w_0$, and notice that
$D(w_i - w_0) = D(i) - D(0)$. Since the minimum weight tree w.r.t. weight function
$w_i - w_0$ has a weight of at most $\alpha(w_i - w_0) + L\delta$ w.r.t. $w_i$,
$\alpha(i) \le \alpha(w_i - w_0) + L\delta$. Hence, if $\alpha(i) - L\delta > 0$,
then

$$\beta \le \frac{D(w_i - w_0)}{\alpha(w_i - w_0)} \le \frac{D(i) - D(0)}{\alpha(i) - L\delta} \le \frac{\varepsilon(1 + \varepsilon) \sum_{j=1}^{i} \alpha(j-1)(h_j - h_{j-1})}{\alpha(i) - L\delta}.$$

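Thus, in any case (when $\alpha(i) - L\delta \le 0$ this follows trivially) we have

$$\alpha(i) \le L\delta + \frac{\varepsilon(1 + \varepsilon)}{\beta} \sum_{j=1}^{i} \alpha(j-1)(h_j - h_{j-1}).$$

Note that, for each fixed i, the right-hand side of the last inequality is
maximized by setting $\alpha(j)$ to its maximum possible value, say $\alpha'(j)$,
for every $0 \le j < i$. Then, the maximum value of $\alpha(i)$ is

$$\alpha'(i) = L\delta + \frac{\varepsilon(1 + \varepsilon)}{\beta} \sum_{j=1}^{i-1} \alpha'(j-1)(h_j - h_{j-1}) + \frac{\varepsilon(1 + \varepsilon)}{\beta}\, \alpha'(i-1)(h_i - h_{i-1})$$

$$= \alpha'(i-1)\left(1 + \frac{\varepsilon(1 + \varepsilon)}{\beta}(h_i - h_{i-1})\right) \le \alpha'(i-1)\, e^{\frac{\varepsilon(1 + \varepsilon)}{\beta}(h_i - h_{i-1})},$$

where the last inequality uses that $1 + x \le e^x$ for every $x \ge 0$. Using that
$\alpha'(0) = L\delta$, this gives

$$\alpha(i) \le L\delta\, e^{\frac{\varepsilon(1 + \varepsilon)}{\beta} h_i}.$$

Let t be the last iteration of the algorithm. Since $\alpha(t) \ge 1$,

$$1 \le L\delta\, e^{\frac{\varepsilon(1 + \varepsilon)}{\beta} h_t}$$

and thus

$$\frac{\beta}{h_t} \le \frac{\varepsilon(1 + \varepsilon)}{\ln (L\delta)^{-1}}.$$

Let $\gamma = \frac{\beta}{h_t} \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}$
be the ratio between the optimum dual objective value and the objective value of
the SP LP solution produced by the algorithm. By substituting the previous bound
on $\beta/h_t$ we obtain

$$\gamma \le \frac{\varepsilon(1 + \varepsilon) \log_{1+\varepsilon} \frac{(1+\varepsilon)\Gamma}{\delta}}{\ln (L\delta)^{-1}} = \frac{\varepsilon(1 + \varepsilon)}{\ln(1 + \varepsilon)} \cdot \frac{\ln \frac{(1+\varepsilon)\Gamma}{\delta}}{\ln (L\delta)^{-1}}.$$

For $\delta = (1 + \varepsilon)\Gamma \left((1 + \varepsilon) L \Gamma\right)^{-1/\varepsilon}$,

$$\frac{\ln \frac{(1+\varepsilon)\Gamma}{\delta}}{\ln (L\delta)^{-1}} = \frac{\ln \left((1 + \varepsilon) L \Gamma\right)^{1/\varepsilon}}{\ln \left((1 + \varepsilon) L \Gamma\right)^{1/\varepsilon - 1}} = \frac{1}{1 - \varepsilon}$$

and thus

$$\gamma \le \frac{\varepsilon(1 + \varepsilon)}{(1 - \varepsilon) \ln(1 + \varepsilon)} \le \frac{\varepsilon(1 + \varepsilon)}{(1 - \varepsilon)(\varepsilon - \varepsilon^2/2)} \le \frac{1 + \varepsilon}{(1 - \varepsilon)^2}.$$

Here we use the fact that $\ln(1 + \varepsilon) \ge \varepsilon - \varepsilon^2/2$
(by Taylor series expansion of $\ln(1 + \varepsilon)$ around the origin). The proof
of the approximation guarantee is completed by observing that
$(1 + \varepsilon)/(1 - \varepsilon)^2 \le (1 + 4\varepsilon)$ for every
$\varepsilon < 0.15$. The runtime follows by substituting $\delta$ in the bound
given by Lemma 1. □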
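The substitution behind the runtime claim can be written out as follows. This is a reconstruction from Lemma 1 and the stated choice of $\delta$, offered as a sketch rather than as the authors' own derivation; it uses only the elementary bound $\ln(1+\varepsilon) \ge \varepsilon - \varepsilon^2/2 \ge \varepsilon/2$ for $0 < \varepsilon \le 1$.

```latex
\begin{align*}
\frac{(1+\varepsilon)\Gamma}{\delta}
  &= \frac{(1+\varepsilon)\Gamma}
          {(1+\varepsilon)\Gamma\,\bigl((1+\varepsilon)L\Gamma\bigr)^{-1/\varepsilon}}
   = \bigl((1+\varepsilon)L\Gamma\bigr)^{1/\varepsilon}, \\[4pt]
\log_{1+\varepsilon}\frac{(1+\varepsilon)\Gamma}{\delta}
  &= \frac{\ln\bigl((1+\varepsilon)L\Gamma\bigr)}{\varepsilon\,\ln(1+\varepsilon)}
   \le \frac{2\,\ln\bigl((1+\varepsilon)L\Gamma\bigr)}{\varepsilon^{2}}
   = O\!\left(\frac{\log L + \log\Gamma}{\varepsilon^{2}}\right).
\end{align*}
```

Multiplying by the K nets of Lemma 1 and by the cost $T_{tree}$ of one tree computation gives the bound stated in Theorem 1.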
5 Computing Minimum-Weight Feasible Steiner Trees

The key subroutine of the approximation algorithm given in the previous section
is to compute, for a fixed $k$ and given weights $w(X)$, $X \in \mathcal{V}$, a feasible
tree $T \in \mathcal{T}_k$ minimizing $\mathrm{weight}(T)$. Define a
weight function $w'$ on the vertices of the routing graph $G = (V, E)$ by setting
$w'(v) = \sum_{X \in \mathcal{V} : v \in X} w(X)$, and let $w'(T) = \sum_{v \in V(T)} w'(v)$ be the total vertex
weight w.r.t. $w'$ of $T$. Then $\mathrm{weight}(T) = w'(T)$, and the problem reduces to
finding a tree $T \in \mathcal{T}_k$ with minimum total vertex weight w.r.t. $w'$.

Recall that for the GRBB problem and its extensions, $\mathcal{T}_k$ contains all Steiner
trees connecting the source $s_k$ with the sinks $t_k^1, \ldots, t_k^{q_k}$ such that the number
of intermediate vertices on each tree path between $s_k$ and $t_k^i$ has the parity
specified by $a_k^i$, and does not exceed $l_k^i$. In this case we can further reduce the
problem of finding the tree $T \in \mathcal{T}_k$ minimizing $w'(T)$ to the minimum-cost
directed rooted Steiner tree (DRST) problem in a directed acyclic graph $D_k$.
Unfortunately, the minimum-cost DRST problem is NP-hard, and the fact that $D_k$
is acyclic does not help since there is a simple reduction for this problem from
arbitrary directed graphs to acyclic graphs. As far as we know, the best result
for the DRST problem, due to Charikar et al. 0, gives $O(\log^2 q_k)$-approximate
solutions in quasi-polynomial time. Note, on the other hand, that
the minimum-cost DRST can be found in polynomial time for small nets (e.g.,
for nets with at most $M$ sinks, for $M = 2, 3, 4$); most of the
nets in industrial VLSI designs fall into this category uni. For nets of small size,
Theorem n immediately gives:

Corollary 1. If the maximum net size is $M \le 4$, the algorithm in Figure n
finds, for every $\epsilon < 0.15$, a feasible solution to the SP LP within a factor of
$1/(1 + 4\epsilon)$ of optimum in time $O(\ldots(\log n + \log \Gamma))$.

We have implemented both heuristics that use approximate DRSTs instead
of optimum DRSTs and heuristics that decompose larger nets into nets with 2-4
pins before applying the algorithm in Figure 0; results of experiments comparing
these approaches are reported in Section 7.

6 Rounding Fractional SP LP Solutions 

In the previous two sections we presented an algorithm for computing near-optimal
solutions to the SP LP. In this section we give two algorithms based
on the randomized rounding technique of Raghavan and Thompson [14] (see also
d) for converting these solutions to integer SP ILP solutions.

The first algorithm is to route net $N_k$ with probability equal to $f_k = \sum_{T \in \mathcal{T}_k} f_T$,
picking, for selected nets, one of the trees $T \in \mathcal{T}_k$ with probability
$f_T / f_k$. A drawback of this algorithm is that it requires the explicit representation
of trees $T \in \mathcal{T}$ with $f_T \neq 0$. Although the approximate SP LP algorithm
produces a polynomial number of trees with non-zero $f_T$, storing all such trees
Input: Net- and edge-cumulated $f_T$ values, $f_k = \sum_{T \in \mathcal{T}_k} f_T$,
       $f_k(e) = \sum_{T \in \mathcal{T}_k : e \in E(T)} f_T$, $k = 1, \ldots, K$, $e \in E(D_k)$

Output: Routed trees $T_k \in \mathcal{T}_k$

For each $k = 1, \ldots, K$, select net $N_k$ with probability $f_k$
Route each selected net $N_k$ as follows:
    $T_k \leftarrow \{s_k\}$
    For each sink $t_k^i$ in $N_k$ do
        $P \leftarrow \emptyset$; $v \leftarrow t_k^i$
        While $v \notin T_k$ do
            Pick arc $(u, v)$ with probability $f_k(u,v) / \sum_{(w,v) \in E} f_k(w,v)$
            $P \leftarrow P \cup \{(u, v)\}$; $v \leftarrow u$
        End while
        $T_k \leftarrow T_k \cup P$
    End for

Fig. 3. The random walk based rounding algorithm.



is infeasible for large problem instances. Our second rounding algorithm (Figure 3)
takes as input the net- and edge-cumulated $f_T$ values, $f_k = \sum_{T \in \mathcal{T}_k} f_T$,
respectively $f_k(e) = \sum_{T \in \mathcal{T}_k : e \in E(T)} f_T$, thus using only $O(K|E|)$ space.
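
The first rounding algorithm can be sketched in a few lines. This is a minimal illustration, assuming that the fractional solution is available as an explicit dictionary from a hypothetical tree identifier to its value $f_T$ for each net (exactly the representation whose size the text identifies as the drawback); the function name and data layout are not from the paper.

```python
import random

def round_by_tree_sampling(nets, rng):
    """nets maps k -> {tree_id: f_T}; returns {k: chosen tree_id} for routed nets."""
    routed = {}
    for k, trees in nets.items():
        f_k = sum(trees.values())
        if rng.random() < f_k:                  # route net k with probability f_k
            ids = list(trees)
            weights = [trees[t] for t in ids]   # tree T is chosen w.p. f_T / f_k
            routed[k] = rng.choices(ids, weights=weights)[0]
    return routed

# Usage with a seeded generator so that runs are repeatable.
example = {1: {"T1": 0.5, "T2": 0.5}, 2: {"T3": 0.25}}
print(round_by_tree_sampling(example, random.Random(0)))
```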

Like the first rounding algorithm, the algorithm in Figure 3 routes each net
$N_k$ with a probability of $f_k = \sum_{T \in \mathcal{T}_k} f_T$. The difference is in how each chosen
net is routed: to route net $N_k$, the algorithm performs backward random walks
from each sink of $N_k$ until reaching either the source of $N_k$ or a vertex already
connected to the source. The random walks are performed in the directed acyclic
graphs used for DRST computation, with probabilities given by the normalized
$f_k(e)$ values.
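
The backward random walks for a single selected net can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the edge-cumulated values $f_k(e)$ are passed as a dictionary keyed by arcs $(u, v)$ of the acyclic graph, every vertex reached by a walk other than the source is assumed to have an incoming arc with positive value, and the name `route_net` is hypothetical.

```python
import random

def route_net(source, sinks, edge_flow, rng):
    """Connect every sink to `source` by backward random walks; return the tree arcs."""
    incoming = {}
    for (u, v), f in edge_flow.items():
        if f > 0:
            incoming.setdefault(v, []).append((u, f))

    tree_vertices = {source}
    tree_arcs = set()
    for t in sinks:
        path, v = [], t
        while v not in tree_vertices:
            preds = incoming[v]
            # Arc (u, v) is picked with probability f_k(u, v) / sum_w f_k(w, v).
            u = rng.choices([p for p, _ in preds],
                            weights=[f for _, f in preds])[0]
            path.append((u, v))
            v = u
        tree_arcs.update(path)
        # Heads of the new arcs are exactly the vertices added by this walk.
        tree_vertices.update(head for _, head in path)
    return tree_arcs

flow = {("s", "a"): 1.0, ("a", "t1"): 0.6, ("s", "t1"): 0.4, ("a", "t2"): 1.0}
print(route_net("s", ["t1", "t2"], flow, random.Random(1)))
```

Because each walk stops at the first vertex already in the tree, every vertex receives at most one incoming arc, so the returned arcs form a tree rooted at the source.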

On average, the total importance of the nets routed by each of the two
algorithms is $\sum_k g_k f_k$. By Theoremni this is within a factor
of $1/(1 + 4\epsilon)$ of the optimum SP LP solution, which in turn is an upper bound
on the optimum SP ILP solution. Ensuring that no set capacity is exceeded
can be accomplished in two ways. One approach is to solve the SP LP with
set capacities scaled down by a small factor which guarantees that the rounded
solution meets the original capacities with very high probability (see [II HL). A
more practical approach, extending the so-called greedy-deletion algorithm in
p] to multiterminal nets, is to repeatedly drop routed paths passing through
over-used sets until feasibility is achieved.

7 Experimental Results 

We have implemented four greedy algorithms for the GRBB problem; all four 
greedy algorithms route nets sequentially. For a given net, the algorithms start 
Fig. 4. Percent of sinks connected vs. CPU time.



with a tree containing only the net’s source, then iteratively add shortest paths 
from each sink to the already constructed tree. The only difference is in whether 
or not net decomposition is used, and in the size of the decomposed nets. The 
first three algorithms — referred to as 2TG, 3TG, and 4TG, respectively — start 
by decomposing larger multiterminal nets into 2-, 3-, respectively 4-pin nets. 
The fourth algorithm, MTG, works on the original (undecomposed) nets. 

We have also implemented four algorithms that approximate the fractional
solution to the SP LP corresponding to the GRBB problem (which generalizes the
node-capacitated multiterminal multicommodity flow problem) and then apply
randomized rounding. The first three algorithms (2TMGF, 3TMGF, and
4TMGF) decompose larger nets into 2-, 3-, respectively 4-pin nets, then call the
algorithm in Figure 3 with exact DRST computations. The fourth algorithm,
MTMGF, works on the original (undecomposed) nets, using shortest-path trees
as approximate DRSTs in the SP LP approximation algorithm.

Figure 4 plots the solution quality versus the CPU time (on a 195MHz SGI
Origin 2000) of each implemented algorithm. The test cases used in our experiments
were extracted from the next-generation (as of January 2000) microprocessor
chip at SGI. The results clearly demonstrate the high quality of solutions
obtained by rounding the approximate SP LP solutions. The MTMGF algorithm
proves to be the best among all algorithms when the time budget is limited, providing
significant improvements over greedy algorithms without undue runtime
penalty. However, the best convergence to the optimum is achieved by 4TMGF,
which dominates all other algorithms when high time budgets are allowed.
Abstract. In this talk, we will give an overview of how content is distributed
on the internet, with an emphasis on the approach being used
by Akamai. We will describe some of the technical challenges involved 
in operating a network of thousands of content servers across multiple 
geographies on behalf of thousands of customers. The talk will be intro- 
ductory in nature and should be accessible to a broad audience. 
Abstract. An upward embedding of an embedded planar graph states,
for each vertex v, which edges are incident to v “above” or “below” 
and, in turn, induces an upward orientation of the edges. In this paper 
we characterize the set of all upward embeddings and orientations of a 
plane graph by using a simple flow model. We take advantage of such a 
flow model to compute upward orientations with the minimum number of 
sources and sinks of 1-connected graphs. Our theoretical results allow us 
to easily compute visibility representations of 1-connected graphs while 
having a certain control over the width and the height of the computed 
drawings, and to deal with partial assignments of the upward embeddings 
“underlying” the visibility representations. 



1 Introduction 



Let $G$ be an undirected planar graph with a given planar embedding. Loosely
speaking, an upward embedding (also called an upward representation) of $G$ is
given by splitting, for each vertex $v$ of $G$, the ordered circular list of the edges
that are incident to $v$ into two linear lists $L_{above}(v)$ and $L_{below}(v)$, in such a
way that there exists a planar drawing $\Gamma$ of $G$ with the following properties:
(i) all the edges are monotonically increasing in the vertical direction; (ii) for
each vertex $v$ the edges in $L_{above}(v)$ ($L_{below}(v)$) are incident to $v$ above (below)
the horizontal line through $v$. Drawing $\Gamma$ is said to be an upward drawing of $G$.
An orientation of all edges of $\Gamma$ from bottom to top defines an orientation of
all edges of $G$, which we call an upward orientation of $G$. Hence, any upward
embedding of $G$ induces an upward orientation of $G$. Figure 1 shows an upward
embedding of a plane graph and the upward orientation induced by it.

Upward embeddings and orientations of undirected graphs have been widely 
studied within specific theoretical and application domains. For example, a deep 
investigation of the properties of upward embeddings and drawings for ordered 
sets and planar lattices can be found in 1161141117^. Relations between the
problem of finding bar layout drawings of weighted undirected graphs and the
problem of computing upward orientations with specific properties are provided
in ITCETT). An important class of upward orientations is represented by the so-called
bipolar orientations (or st-orientations). A bipolar orientation of an undirected
planar graph $G$ is an upward orientation of $G$ with exactly one source $s$
Fig. 1. (a) An embedded planar graph. (b) An upward embedding of the embedded
planar graph. For each vertex $v_i$ of the graph the edges in $L_{below}(v_i)$ and $L_{above}(v_i)$ are
drawn incident below and above the horizontal line through $v_i$, respectively. (c) The
upward orientation induced by the upward embedding.



(vertex without in-edges) and one sink t (vertex without out-edges). A bipolar 
orientation of G with source s and sink t exists if and only if G U {(s,t)} is 
biconnected. Finding a bipolar orientation of a planar graph is often the first 
step of many algorithms in graph theory and graph drawing. The properties of 
bipolar orientations have been extensively studied in [Zl, and a characterization 
of bipolar orientations in terms of a network flow model is described in 0. 

Many results on upward embeddings of digraphs have also been provided in
the literature. In this case, the orientation of the edges of the graph is given, and
a classical problem consists in finding a planar upward embedding within such
an orientation. Clearly, a planar upward embedding of a digraph might not exist.
In P] a polynomial time algorithm for testing the existence of planar upward
embeddings of a digraph within a given embedding is described. The algorithm
is also able to construct an upward embedding if one exists. In the variable
embedding setting the upward planarity testing problem for digraphs is NP-complete
El, but it can be solved in polynomial time for digraphs with a single
source 0.

In this paper we focus on upward embeddings and orientations of undirected 
planar graphs. The main contributions of our work are the following: 

— Starting from the properties on upward planarity given in P|, we provide
a full characterization of the set of all upward embeddings and orientations
of any embedded planar graph (Section 3.1). It is based on a network flow
model, which is related to the one used in p] for characterizing bipolar
orientations. In particular, if the graph is biconnected, our flow model also
captures all bipolar orientations of the graph.

— We describe flow based polynomial time algorithms for computing upward
embeddings of the input graph. Such algorithms allow us to deal with partial
assignments of the upward embedding (Section 3.1). Further, we provide a
polynomial time algorithm to compute upward orientations with the minimum
number of sources and sinks (Section 3.2). An upward orientation
with the minimum number of sources and sinks can be viewed as a natural
extension of the concept of bipolar orientation to 1-connected graphs.

— We describe a simple technique to compute visibility representations of 1-connected
planar graphs (Section 4), which can be of practical interest for
graph drawing applications. It is based on the computation of an upward
embedding of the graph, and does not require running any augmentation
algorithm to initially make the graph biconnected. Compared to a standard
technique that uses the approximation algorithm in uni to make the graph
biconnected, the algorithm we propose is faster and achieves similar results
in terms of area of the visibility representation. We also present (Section 4.1)
a preliminary experimental study that shows how our technique can be used
to have a certain control over the width and the height of the visibility
representations.

In Section 2 we give some basic definitions and results on upward embeddings
and orientations of undirected planar graphs. Due to space limitations, all the
proofs of the theorems are omitted, and can be found in P|.

2 Basic Definitions and Results on Upward Embeddings

Let $G$ be a graph. A drawing $\Gamma$ of $G$ maps each vertex $u$ of $G$ into a point $p_u$
of the plane and each edge $(u,v)$ of $G$ into a Jordan curve between $p_u$ and $p_v$.
$\Gamma$ is planar if two distinct edges never intersect except at common end-points.
$G$ is planar if it admits a planar drawing. A planar drawing $\Gamma$ of $G$ divides the
plane into topologically connected regions called faces. Exactly one of these faces
is unbounded, and it is said to be external; the others are called internal faces.
Also, for each vertex $v$ of $G$, $\Gamma$ induces a circular clockwise ordering of the edges
incident on $v$. The choice $\phi$ of such an ordering for each vertex of $G$ and of an
external face is called a planar embedding of $G$. A planar graph $G$ with a given
planar embedding $\phi$ is called an embedded planar graph and denoted by $G_\phi$. A
drawing of $G_\phi$ is a planar drawing of $G$ that induces $\phi$ as the planar embedding.

Let $G_\phi$ be an (undirected) embedded planar graph. An upward embedding $\mathcal{E}_\phi$
of $G_\phi$ is a splitting of the adjacency lists of all vertices of $G_\phi$ such that: (Pr.a)
for each vertex $v$ of $G_\phi$ the circular clockwise list $L(v)$ of the edges incident
on $v$ is split into two linear lists, $L_{below}(v)$ and $L_{above}(v)$, so that the circular
list obtained by concatenating $L_{above}(v)$ and the reverse of $L_{below}(v)$ is equal to
$L(v)$; (Pr.b) there exists a planar drawing $\Gamma(\mathcal{E}_\phi)$ of $G_\phi$ such that all the edges
are monotonically increasing in the vertical direction and for each vertex $v$ of
$G_\phi$ the edges of $L_{below}(v)$ and $L_{above}(v)$ are incident to $v$ below and above the
horizontal line through $v$, respectively. We say that $\Gamma(\mathcal{E}_\phi)$ is a drawing of $\mathcal{E}_\phi$ and
an upward drawing of $G_\phi$.

An upward embedding $\mathcal{E}_\phi$ of $G_\phi$ uniquely induces an upward orientation $O_\phi$
of $G_\phi$. Namely, for each edge $e = (u, v)$ such that $e \in L_{above}(u)$ and $e \in L_{below}(v)$,
we orient $e$ from $u$ to $v$ (see Figure 1). Conversely, an upward orientation defines
in general a class of possible upward embeddings inducing that orientation. A
source of $\mathcal{E}_\phi$ is a vertex $v$ of $G_\phi$ such that $L_{below}(v)$ is empty. A source has only
out-edges with respect to orientation $O_\phi$. A sink of $\mathcal{E}_\phi$ is a vertex $v$ of $G_\phi$ such
that $L_{above}(v)$ is empty. A sink has only in-edges with respect to $O_\phi$.

Given a vertex $v$ of $G_\phi$, we denote by $deg(v)$ the number of edges incident on
$v$. An angle of $G_\phi$ at vertex $v$ is a pair of clockwise consecutive edges incident
on $v$. In particular, if $deg(v) = 1$, and if we denote by $e$ the edge incident on
$v$, $\{e, e\}$ is an angle. Given a splitting of the adjacency lists of $G_\phi$ that verifies
Pr.a, an angle $\{e_1, e_2\}$ at vertex $v$ of $G_\phi$ can be of three types. Large: (i) both
$e_1$ and $e_2$ belong to $L_{below}(v)$ ($L_{above}(v)$), and (ii) $e_1$ and $e_2$ are the first (last)
edge and the last (first) edge of $L_{below}(v)$ ($L_{above}(v)$), respectively. We associate
a label L with a large angle. Flat: (i) $e_1 \in L_{below}(v)$ and $e_2 \in L_{above}(v)$ or, (ii)
$e_1 \in L_{above}(v)$ and $e_2 \in L_{below}(v)$. We associate a label F with a flat angle.
Small: in all the other cases. We associate a label S with a small angle.
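
The three angle types can be computed directly from the two lists at a vertex. The sketch below is a minimal illustration, assuming edges are given as distinct hashable labels and using the convention stated above that the circular clockwise list $L(v)$ is $L_{above}(v)$ followed by the reverse of $L_{below}(v)$; the function name is hypothetical.

```python
def label_angles(above, below):
    """Return [(e1, e2, label)] for the clockwise-consecutive angles at one vertex."""
    circular = list(above) + list(reversed(below))
    side = {e: "above" for e in above}
    side.update({e: "below" for e in below})
    n = len(circular)
    labels = []
    for i, e1 in enumerate(circular):
        e2 = circular[(i + 1) % n]
        if side[e1] != side[e2]:
            label = "F"    # one edge below, the other above
        elif i == n - 1:
            label = "L"    # wrap-around angle; only reached when one list is empty
        else:
            label = "S"
        labels.append((e1, e2, label))
    return labels

print(label_angles(["a", "b"], ["c", "d", "e"]))   # vertex that is neither source nor sink
print(label_angles(["a", "b", "c"], []))            # a source: L_below is empty
```

A vertex with both lists non-empty gets exactly two flat angles and no large angle, while a source or sink gets exactly one large angle, which is the wrap-around angle of its single list.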

Figure 2 shows the labeling of the angles of a graph determined by an
upward embedding $\mathcal{E}_\phi$. Each drawing of $\mathcal{E}_\phi$ maps the angles of $G_\phi$ to geometric
angles such that large and small angles always correspond to geometric angles
larger and smaller than 180 degrees, respectively. Both edges that form
a large or a small angle at vertex $v$ are incident to $v$ either above or below
the horizontal line through $v$. Instead, a flat angle at vertex $v$ corresponds to
a geometric angle that can be either larger or smaller than 180 degrees; in any
case, an edge of the angle is incident to $v$ above the horizontal line through $v$
while the other edge is incident to $v$ below the same line.




Fig. 2. Labeling of the angles of an embedded planar graph determined by an upward 
embedding of the graph. 



Let $f$ be a face of $G_\phi$. We call border of $f$ the alternating circular list of the
vertices and edges that form the boundary of $f$. Note that, if the graph is not
biconnected, an edge or a vertex may appear more than once in the border of $f$.
We say that an angle $\{e_1, e_2\}$ at vertex $v$ belongs to face $f$ if $e_1$, $e_2$, and $v$ belong
to the border of $f$. The degree of $f$, denoted by $deg(f)$, is the number of edges
in the border of $f$. Observe that $deg(f)$ is equal to the number of angles of $f$.

Consider now any labeling of the angles of $G_\phi$ with labels L, S, and F. For
each face $f$ of $G_\phi$ denote by $L(f)$, $S(f)$, and $F(f)$ the number of angles that
belong to $f$ with label L, S, and F, respectively. Also, for each vertex $v$ of $G_\phi$
denote by $L(v)$, $S(v)$, and $F(v)$ the number of angles at vertex $v$ with label L,
S, and F, respectively. The following lemma is a restatement of a known result
on upward planarity [3].



Lemma 1. Let $\mathcal{E}_\phi$ be a splitting of the adjacency lists of $G_\phi$ that verifies Pr.a,
and consider the labeling of the angles of $G_\phi$ determined by it. $\mathcal{E}_\phi$ is an upward
embedding of $G_\phi$ if and only if the following properties hold:

(a) $S(f) = L(f) + 2$, for each internal face $f$ of $G_\phi$.

(b) $S(f) = L(f) - 2$, for the external face $f$ of $G_\phi$.

(c) $F(v) = 2$, $S(v) = deg(v) - 2$, and $L(v) = 0$, for each vertex $v$ of $G_\phi$ such
that both $L_{above}(v)$ and $L_{below}(v)$ are not empty.

(d) $F(v) = 0$, $S(v) = deg(v) - 1$, and $L(v) = 1$, for each vertex $v$ of $G_\phi$ such
that either $L_{above}(v)$ or $L_{below}(v)$ is empty.

Properties (c) and (d) of Lemma 1 state that if $\mathcal{E}_\phi$ is an upward embedding of
$G_\phi$, each source or sink of $\mathcal{E}_\phi$ has exactly one large angle and no flat angles, while
each vertex that is neither a source nor a sink has exactly two flat angles and
no large angles. The next lemma provides a different formulation for properties
(a) and (b).

Lemma 2. Properties (a) and (b) of Lemma 1 are equivalent to the following
properties: (a') $deg(f) - 2 = 2L(f) + F(f)$, for each internal face $f$ of $G_\phi$. (b')
$deg(f) + 2 = 2L(f) + F(f)$, for the external face $f$ of $G_\phi$.
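
The equivalence stated in Lemma 2 can be checked by a short counting argument. The derivation below is a sketch that uses only the observation made earlier that $deg(f)$ equals the number of angles of $f$, together with the fact that every angle carries exactly one of the labels L, S, F; it is not taken from the omitted proof.

```latex
\begin{align*}
deg(f) &= S(f) + L(f) + F(f) \\
\text{internal } f:\quad S(f) = L(f) + 2
  &\iff deg(f) - L(f) - F(f) = L(f) + 2
  \iff deg(f) - 2 = 2L(f) + F(f) \\
\text{external } f:\quad S(f) = L(f) - 2
  &\iff deg(f) - L(f) - F(f) = L(f) - 2
  \iff deg(f) + 2 = 2L(f) + F(f)
\end{align*}
```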

3 Characterizing Upward Embeddings 

In this section we give a full characterization of the set of all upward embeddings
of a general embedded planar graph (Section 3.1). This also implies a characterization
of all upward orientations of the given graph. Our characterization uses
a flow model that is related to the one described in d for bipolar orientations.
Also, we show how it is possible to add costs to our flow model in order to
compute in polynomial time an upward orientation with the minimum number
of sources and sinks (Section 3.2).

3.1 A Flow Model for Characterizing Upward Embeddings 

The following theorem characterizes the class of labelings that are determined
by all upward embeddings of an embedded planar graph. Observe that the characterization
of such a class of labelings does not depend either on the choice of
a splitting of the adjacency lists of the graph, in contrast to the result given in
Lemma 1, or on the choice of an orientation of the graph.

Theorem 1. Let $\mathcal{L}$ be any labeling of the angles of an embedded graph $G_\phi$ with
labels L, S, and F. $\mathcal{L}$ is the labeling determined by an upward embedding of $G_\phi$
if and only if the following properties hold:

(a') $deg(f) - 2 = 2L(f) + F(f)$, for each internal face $f$ of $G_\phi$.

(b') $deg(f) + 2 = 2L(f) + F(f)$, for the external face $f$ of $G_\phi$.

(c') For each vertex $v$ either $F(v) = 2$ and $L(v) = 0$ or $F(v) = 0$ and $L(v) = 1$.

We call upward labeling of $G_\phi$ a labeling of the angles of $G_\phi$ that verifies
properties (a'), (b'), and (c') of Theorem 1. The result of Theorem 1 allows us
to describe all upward embeddings of $G_\phi$ by considering all upward labelings of
$G_\phi$. The proof of the theorem (see jHI) gives a method to construct the upward
embedding associated with an upward labeling. Actually, for each upward labeling,
there are exactly two “symmetric” upward embeddings that determine it;
they are obtained one from the other by simply exchanging list $L_{above}(v)$ with
list $L_{below}(v)$ for each vertex $v$ and then reversing such lists (see Figure 2(b)).

We now provide a network flow model that characterizes all the upward
labelings of $G_\phi$. Because of the above considerations, this flow model provides
a characterization of all upward embeddings of $G_\phi$. We associate with $G_\phi$ a
flow network $\mathcal{N}_\phi$, such that the integer feasible flows on $\mathcal{N}_\phi$ are in one-to-one
correspondence with the upward labelings of $G_\phi$. Flow network $\mathcal{N}_\phi$ is a directed
graph defined as follows (see Figure 3): (i) The nodes of $\mathcal{N}_\phi$ are the vertices
(vertex-nodes) and the faces (face-nodes) of $G_\phi$. Each vertex-node supplies flow
2 and each face-node associated with face $f$ of $G_\phi$ demands a flow equal to
$deg(f) - 2$ if $f$ is internal and $deg(f) + 2$ if $f$ is external. (ii) For each angle
of $G_\phi$ at vertex $v$ in face $f$ there is an associated arc $(v, f)$ of $\mathcal{N}_\phi$ with lower
capacity 0 and upper capacity 2.





Fig. 3. (a) An embedded planar graph $G_\phi$. (b) Flow network $\mathcal{N}_\phi$ associated with $G_\phi$.
The vertex-nodes are circles and the face-nodes are squares. Each face-node is marked
with its demand. The arcs of the network are dashed.

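Observe that in $\mathcal{N}_\phi$ the total demand is equal to the total supply. In fact, writing
$V$, $E$, and $F$ for the sets of vertices, edges, and faces of $G_\phi$, Euler's formula gives
$\sum_{f \in F} (deg(f) - 2) + 4 = \sum_{f \in F} deg(f) - 2|F| + 4 = 2|E| - 2|F| + 4 = 2|V|$.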

The intuitive interpretation of the flow model in terms of upward embedding is
as follows: (i) Each unit of flow represents a flat angle, with the convention that
a large angle counts as two flat angles; an arc $a$ of $\mathcal{N}_\phi$ has flow 0, 1, or 2, depending
on whether its associated angle is small, flat, or large, respectively.
(ii) The demand of each face-node and the supply of each vertex-node reflect
the balancing properties (a'), (b') and (c'). Figure 4 shows a feasible flow on the
network associated with an embedded planar graph, the corresponding upward
labeling, and the two “symmetric” upward embeddings associated with the labeling.
Theorem 2 formally proves the correctness of the intuitive interpretation
described above.

The flow network used in 0 to characterize the bipolar orientations of an 
embedded planar graph is tailored for biconnected planar graphs and captures 
only bipolar orientations. The values of the flow are not able to represent large 
Fig. 4. (a) A feasible flow on the network associated with an embedded planar graph.
Only the flow values different from zero are shown. (b) The upward labeling $\mathcal{L}$ corresponding
to the flow and the “symmetric” upward embeddings associated with $\mathcal{L}$.

angles (the flow values are only 0 or 1), except for the source and the sink of the 
orientation. Our flow network generalizes this network to represent any kind of 
upward orientations and embeddings, including the bipolar orientations. 

Theorem 2. Let $G_\phi$ be an embedded planar graph and let $\mathcal{N}_\phi$ be the flow network
associated with $G_\phi$. There is a one-to-one correspondence between the set
of the upward labelings of $G_\phi$ and the set of the integer feasible flows on $\mathcal{N}_\phi$.

Theorem 1 and Theorem 2 allow us to compute an upward embedding of an
embedded planar graph $G_\phi$ by computing an integer feasible flow on network
$\mathcal{N}_\phi$. Denote by $n$ the number of vertices of $G_\phi$. Since network $\mathcal{N}_\phi$ is planar and
has $O(n)$ vertices, a feasible flow on $\mathcal{N}_\phi$ can be computed in $O(n \log n)$ time
by applying a known maximum flow algorithm for planar networks fp. Also,
both $\mathcal{N}_\phi$ and an upward embedding associated with a feasible flow on $\mathcal{N}_\phi$ can be
constructed in linear time. Therefore, an upward embedding of any embedded
planar graph can be constructed in $O(n \log n)$ time with the above technique.
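
The flow model can be exercised on small inputs with a few dozen lines of code. The sketch below is an illustration under stated assumptions, not the method cited in the text: it uses plain unit augmenting paths instead of the planar maximum flow algorithm, it assumes the embedding is given as a list of facial walks (one vertex sequence per face, one entry per angle), and the name `upward_labeling` is hypothetical. An auxiliary node per angle keeps apart the parallel arcs that arise when a vertex appears more than once on a face.

```python
from collections import defaultdict

def upward_labeling(faces, external):
    """Return {(face_index, position): 'S'|'F'|'L'}, or None if no feasible flow exists."""
    vertices = {v for walk in faces for v in walk}
    demand = {i: len(w) + (2 if i == external else -2) for i, w in enumerate(faces)}
    if sum(demand.values()) != 2 * len(vertices):
        return None                      # total demand must equal total supply

    cap = defaultdict(int)               # residual capacities
    adj = defaultdict(set)

    def add_arc(a, b, c):
        cap[(a, b)] += c
        adj[a].add(b)
        adj[b].add(a)

    for v in vertices:
        add_arc("src", ("v", v), 2)      # each vertex-node supplies 2
    for i, walk in enumerate(faces):
        add_arc(("f", i), "snk", demand[i])
        for pos, v in enumerate(walk):   # one arc of capacity 2 per angle
            add_arc(("v", v), ("a", i, pos), 2)
            add_arc(("a", i, pos), ("f", i), 2)

    def augment():
        parent, stack = {"src": None}, ["src"]
        while stack:
            a = stack.pop()
            for b in adj[a]:
                if b not in parent and cap[(a, b)] > 0:
                    parent[b] = a
                    stack.append(b)
        if "snk" not in parent:
            return False
        b = "snk"
        while parent[b] is not None:     # push one unit along the path found
            a = parent[b]
            cap[(a, b)] -= 1
            cap[(b, a)] += 1
            b = a
        return True

    flow = 0
    while augment():
        flow += 1
    if flow != 2 * len(vertices):
        return None
    # Flow on an angle arc equals the residual capacity of its reverse arc:
    # 0 -> small, 1 -> flat, 2 -> large.
    return {(i, pos): "SFL"[cap[(("f", i), ("a", i, pos))]]
            for i, walk in enumerate(faces) for pos in range(len(walk))}

# A triangle: internal face (0, 1, 2) and external face (0, 2, 1).
print(upward_labeling([[0, 1, 2], [0, 2, 1]], external=1))
```

For the triangle, every feasible flow yields one flat angle in the internal face and two large angles plus one flat angle in the external face, matching one source, one sink, and one intermediate vertex.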

We remark that there are two main advantages of computing upward embeddings
of a general plane graph $G_\phi$ by using the flow model described so far: (i) no
augmentation algorithm has to be used to initially make the graph biconnected (we
just apply a standard flow algorithm); (ii) it is possible to fix the flow on some arcs
of the network to constrain the upward embedding to have a partially specified
“shape”. For example, we can specify that an angle must be large in a certain
face, or that some vertices must be neither sources nor sinks. In the next section
we describe how to compute upward embeddings with the minimum number of
sources and sinks, by adding costs to our network.
3.2 Minimizing Sources and Sinks

Computing an upward embedding of $G_\phi$ with the minimum number of sources
and sinks (which we call optimal upward embedding for simplicity) is equivalent
to computing an upward embedding with the minimum number of large angles,
since by Lemma 1 each source or sink has exactly one large angle and no other vertex has any.
Clearly, if the graph is biconnected, the problem is reduced to the computation of
a bipolar orientation. For this reason, we regard the concept of optimal upward
orientation as the natural extension of the definition of bipolar orientation to
the case of general connected graphs.

The flow model we use to compute an optimal upward orientation of $G_\phi$
is a variation of the one described for characterizing upward embeddings (see
Section 3.1). We add a linear number of arcs to network $\mathcal{N}_\phi$ and we equip the
arcs of the new network with costs. Each unit of cost represents a large angle. We
also reduce the upper capacity of all the arcs of the network. In more detail, we
define a network $\mathcal{N}'_\phi$ as follows: (i) The nodes of $\mathcal{N}'_\phi$ are again the vertices (vertex-nodes)
and the faces (face-nodes) of $G_\phi$. Each vertex-node again supplies flow 2
and each face-node associated with face $f$ of $G_\phi$ again demands flow $deg(f) - 2$
if $f$ is internal and $deg(f) + 2$ if $f$ is external. (ii) For each angle of $G_\phi$ at vertex
$v$ in face $f$ there is an associated pair of directed arcs $a_v = (v, f)$, $a'_v = (v, f)$ in
$\mathcal{N}'_\phi$. Both arcs have lower capacity 0 and upper capacity 1. Also, arc $a_v$ has
cost 0 while arc $a'_v$ has cost 1.

In $\mathcal{N}'_\phi$ we compute a minimum cost flow $x$. The interpretation of the flow
in terms of upward labeling is similar to the one given for $\mathcal{N}_\phi$, with a slight
variation due to the additional arcs and costs. We first observe that for each pair
of arcs $a_v, a'_v$ it never happens that $x(a_v) = 0$ and $x(a'_v) = 1$, due to the fact that
the cost of $a_v$ is 0 and the cost of $a'_v$ is 1. In fact, if $x(a_v) = 0$ and $x(a'_v) = 1$,
then there would exist a negative cost cycle represented by the two arcs $a'_v, a_v$,
and it would be possible to derive a new flow $x'$ from $x$ by simply exchanging
one unit of flow between $a'_v$ and $a_v$ (i.e., $x'(a_v) = 1$ and $x'(a'_v) = 0$). This would
imply that $x'$ has a cost smaller than the cost of $x$, in contrast to the assumption
that $x$ has the minimum cost. Hence, the only possibilities for the flow on arcs
$a_v, a'_v$ are: (i) $x(a_v) = x(a'_v) = 0$, the angle associated with arcs $a_v, a'_v$ is small;
(ii) $x(a_v) = 1$ and $x(a'_v) = 0$, the angle associated with arcs $a_v, a'_v$ is flat; (iii)
$x(a_v) = x(a'_v) = 1$, the angle associated with arcs $a_v, a'_v$ is large.

Note that only in the third case we have cost 1 on arcs $a_v, a'_v$, while in the
other two cases we have cost 0. This implies that the total cost of flow $x$ on $\mathcal{N}'_\phi$
represents the total number of large angles of the corresponding upward embedding
of $G_\phi$. Hence, since $x$ has the minimum cost, the corresponding upward
embedding has the minimum number of large angles.
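
The case analysis above amounts to a small decoding table. The sketch below assumes the flow on each arc pair is given as a tuple (flow on the cost-0 arc, flow on the cost-1 arc); the function names are hypothetical and the code only decodes a given flow, it does not compute a minimum cost flow.

```python
def decode_pair(x_a, x_a_prime):
    """Label of an angle from the flow on its arc pair (cost-0 arc, cost-1 arc)."""
    if (x_a, x_a_prime) == (0, 1):
        # Exchange argument from the text: moving the unit to the cost-0 arc
        # keeps the flow feasible and lowers the cost, so a minimum cost flow
        # never contains this pattern.
        raise ValueError("not a minimum cost flow")
    return {(0, 0): "S", (1, 0): "F", (1, 1): "L"}[(x_a, x_a_prime)]

def count_large(flow_pairs):
    """Total cost of the flow, which equals the number of angles labeled L."""
    labels = [decode_pair(a, b) for a, b in flow_pairs]
    cost = sum(b for _, b in flow_pairs)
    assert cost == labels.count("L")
    return cost

print(count_large([(1, 1), (1, 0), (0, 0), (1, 1)]))
```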

Let $n$ be the number of vertices of $G_\phi$. Since network $\mathcal{N}'_\phi$ is planar and has
$O(n)$ vertices, and since its total demand (supply) is $O(n)$, a minimum cost flow
on $\mathcal{N}'_\phi$ can be computed in 0{n* logn) time by the algorithm described in p^.

We conclude this section by giving an upper bound on the number of sources 
and sinks of an optimal upward embedding. 

Lemma 3. An optimal upward embedding of an embedded planar graph $G_\phi$ has
at most $B+1$ sources and sinks, where $B$ is the number of blocks of $G_\phi$. Also, in
the optimal upward embedding each block contains at most one source and one
sink.

The bound of Lemma 3 is tight, and a class of plane graphs whose upward
embeddings have $B + 1$ sources and sinks can be obtained by nesting each block
into another, as shown by the example of Figure 5(a).




Fig. 5. (a) A class of embedded planar graphs whose optimal upward embeddings have
$B + 1$ sources and sinks (circles). (b) A visibility representation of the upward embedded
graph shown in Figure QJb).



4 Algorithms for Visibility Representations 

We use the above results on upward embeddings to compute drawings of general
connected planar graphs. Namely, we focus on graph drawing algorithms which
require the computation of a (weak-)visibility representation of the input graph
as a preliminary step 0. In a visibility representation (see Figure 5(b)), each
vertex is mapped to a horizontal segment and each edge $(u, v)$ is mapped to
a vertical segment between the segments associated with $u$ and $v$; horizontal
segments do not overlap, and each vertical segment only intersects its extreme
horizontal segments.

A standard technique [B| to compute a visibility representation of a plane
graph $G$ consists of calculating a bipolar orientation of $G$; if $G$ is not biconnected
it is augmented to a biconnected planar graph by adding a suitable number
of dummy edges, which will be removed in the final drawing. However, this
technique has several drawbacks: (i) Adding too many dummy edges may lead
to a final drawing with area much bigger than necessary. On the other hand,
the problem of adding the minimum number of edges to make a planar graph
biconnected and still planar is NP-hard [13]. (ii) Although an approximation
algorithm for the above augmentation problem exists [13] (which reaches the
optimal solution in many cases), implementing it efficiently is quite difficult,
because it requires us to deal with the block tree of the graph and with an
efficient incremental planarity testing algorithm. In fact, such an approximation



348 



W. Didimo and M. Pizzonia 



algorithm has $O(n^2 T)$ running time, where $T$ is the amortized time bound per
query or insertion operation of the incremental planarity testing algorithm. (iii)
The presence of dummy edges in the graph makes it difficult to deal with partial
assignments of the upward embedding.

A strategy for computing visibility representations of general connected graphs
is sketched in jini; it does not explicitly detail how to
perform some necessary topological and geometric operations. We propose the
following algorithm for computing a visibility representation of a 1-connected
embedded planar graph $G_\phi$.

Algorithm Visibility-Upward-Embedding

1. Compute an upward embedding of G^, by calculating a feasible flow on 
network 

2. Compute an upward embedded st-graph including G^ and preserving 
on G^, by using the linear time saturation procedure described in jSj (note 
that, is biconnected) . 

3. Compute a visibility representation of (within its upward embedding) 
by using any known linear time algorithm |B|, and then remove the edges 
introduced by the saturation procedure. 

Algorithm Visibility-Upward-Embedding has $O(n \log n)$ running time, because
its time complexity is dominated by the cost of computing a feasible flow
on the network. We experimentally observed that the area of the visibility representations
produced by this algorithm can be dramatically improved by computing upward
embeddings with the minimum number of sources and sinks. To do that, we just
apply a min-cost-flow algorithm in Step 1. Clearly, in this case, the running time
of the whole algorithm grows to $O(n^2 \log n)$. The following theorem summarizes
the main contributions of this section.

Theorem 3. There exists an $O(n^2 \log n)$ time algorithm that computes an upward
embedding of an embedded 1-connected planar graph with the minimum
number of sources and sinks.

4.1 Experimentation 

We present a preliminary study that shows how algorithm Visibility-Upward-Embedding
can be slightly refined in order to get a certain control over the width
and the height of visibility representations of 1-connected planar graphs. We
start from the intuition that by re-arranging the blocks around the cutvertices
in the upward embedding, it is possible to reduce the height or the width of
the visibility representation. Namely, if v is a cutvertex and f a face which is
doubly incident on v, placing all the blocks of v either above or below leads to a
reduction of the height and to an increase in the width. Such a re-arrangement is
easily performed by exploiting the flow network associated with the plane graph.
In such a network we can always move a unit of flow from an arc $a_1 = (v, f)$ to
another arc $a_2 = (v, f)$ while keeping the flow feasible. Clearly, arcs $a_1$ and
$a_2$ exist only if v is a cutvertex.

The experimentation has been performed on a randomly generated test suite 
of 1820 graphs whose number n of vertices ranges from 10 to 100 (20 instances 
for each value of n). The graphs are planar and equipped with an embedding. 
Each graph of the test suite has a number of cutvertices between n/10 and n/5, 
the number of blocks attached to a cutvertex is between 2 and 5, the number 
of cutvertices of a block is between 1 and 5, and each biconnected component is 
generated by using the algorithm in [2]. A detailed description of the procedure
can be found in ini. 

For each graph of the test suite, we first compute an upward embedding
having the minimum number of sources and sinks that keeps the given embedding
unchanged. We then redistribute the flow around at most k
cutvertices in such a way as to place all their blocks above or below them. Note that
such a redistribution can easily be done in linear time. We average the width
and the height over all the graphs having the same number of vertices. The charts in
Figure 6 show the results of the experimentation for k ranging from
0 to 8. For the same value of n and increasing k,
the average width increases while the average height decreases.

Also, Figure 7 compares the area of the drawings computed with our strategy,
where k is chosen equal to the total number of cutvertices of the graph, against 
the area of the drawings computed with a standard technique which uses the ap- 
proximation algorithm in m to initially make the graph biconnected. In the two 
strategies we use the same algorithm for producing the visibility representation 
from the st-graph. 



Fig. 6. The charts show how re-arranging the blocks around cutvertices affects the
width and the height of the visibility representation: (a) average height, (b) average width.

Fig. 7. Area of the drawings computed with our strategy against the area of the drawings
computed with a standard technique based on a sophisticated augmentation algorithm
(average values). The x-axis represents the number of vertices.



References 

1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows. In G. L. Nemhauser, 
A. H. G. Rinnooy Kan, and M. J. Todd, editors, Optimization, volume 1 of Hand- 
books in Operations Research and Management, pages 211-360. North-Holland, 
1990. 

2. P. Bertolazzi, G. Di Battista, and W. Didimo. Computing orthogonal drawings
with the minimum number of bends. IEEE Transactions on Computers, 49(8), 2000.

3. P. Bertolazzi, G. Di Battista, G. Liotta, and C. Mannino. Upward drawings of
triconnected digraphs. Algorithmica, 12(6):476-497, 1994.

4. P. Bertolazzi, G. Di Battista, C. Mannino, and R. Tamassia. Optimal upward 
planarity testing of single-source digraphs. SIAM J. Comput., 27(1):132-169, 1998. 

5. M. Bousset. A flow model of low complexity for twisting a layout. In Workshop of 
GD’93, pages 43-44, Paris, 1993. 

6. J. Czyzowicz, A. Pelc, and I. Rival. Drawing orders with few slopes. Technical
Report TR-87-12, Department of Computer Science, University of Ottawa, 1987.

7. H. de Fraysseix, P. O. de Mendez, and P. Rosenstiehl. Bipolar orientations revisited. 
Discrete Appl. Math., 56:157-179, 1995. 

8. G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis. Graph Drawing. Prentice
Hall, Upper Saddle River, NJ, 1999.

9. W. Didimo and M. Pizzonia. Upward embeddings and orientations of undirected 
planar graphs. Technical Report RT-DIA-65-2001, University of Roma Tre, 2001. 

10. S. Fialko and P. Mutzel. A new approximation algorithm for the planar augmenta- 
tion problem. In Symposium on Discrete Algorithms (SODA ’98), pages 260-269, 
1998. 

11. A. Garg and R. Tamassia. On the computational complexity of upward and
rectilinear planarity testing. In R. Tamassia and I. G. Tollis, editors, Graph Drawing
(Proc. GD ’94), volume 894 of Lecture Notes Comput. Sci., pages 286-297.
Springer-Verlag, 1995.

12. A. Garg and R. Tamassia. A new minimum cost flow algorithm with applications
to graph drawing. In S. C. North, editor, Graph Drawing (Proc. GD ’96), volume
1190 of Lecture Notes Comput. Sci., pages 201-216. Springer-Verlag, 1997.

13. G. Kant and H. L. Bodlaender. Planar graph augmentation problems. In Proc.
2nd Workshop Algorithms Data Struct., volume 519 of Lecture Notes Comput. Sci.,
pages 286-298. Springer-Verlag, 1991.








14. D. Kelly. Fundamentals of planar ordered sets. Discrete Math., 63:197-216, 1987. 

15. D. Kelly and I. Rival. Planar lattices. Canad. J. Math., 27(3):636-665, 1975. 

16. D. G. Kirkpatrick and S. K. Wismath. Weighted visibility graphs of bars and
related flow problems. In Proc. 1st Workshop Algorithms Data Struct., volume 382
of Lecture Notes Comput. Sci., pages 325-334. Springer-Verlag, 1989.

17. M. Pizzonia. Engineering of Graph Drawing Algorithms for Applications. PhD
thesis, Dipartimento di Informatica e Sistemistica, Università “La Sapienza” di
Roma, 2001.

18. I. Rival. Reading, drawing, and order. In I. G. Rosenberg and G. Sabidussi, editors,
Algebras and Orders, pages 359-404. Kluwer Academic Publishers, 1993.

19. R. Tamassia and I. G. Tollis. A unified approach to visibility representations of 
planar graphs. Discrete Comput. Geom., 1(4):321-341, 1986. 

20. S. K. Wismath. Bar-Representable Visibility Graphs and Related Flow Problems.
Ph.D. thesis, Dept. Comput. Sci., Univ. British Columbia, 1989.




An Approach for Mixed Upward Planarization



Markus Eiglsperger and Michael Kaufmann 

Universität Tübingen, Wilhelm-Schickard-Institut für Informatik
{eiglsper,mk}@informatik.uni-tuebingen.de



Abstract. In this paper, we consider the problem of finding a mixed 
upward planarization of a mixed graph, i.e., a graph with directed and 
undirected edges. The problem is a generalization of the planarization 
problem for undirected graphs and is motivated by several applications in 
graph drawing. We present a heuristic approach for this problem which
provides good quality and reasonable running time in practice, even for
large graphs. This planarization method combined with a graph drawing
algorithm for upward planar graphs can be seen as a real alternative to
the well-known Sugiyama algorithm.



1 Introduction 

Upward drawings of digraphs have been studied extensively in recent
years. One reason is that such drawings have many applications in areas like
workflow, project management and data flow.

An upward drawing of a digraph is a drawing such that all the edges are 
represented by curves monotonically increasing in the vertical direction. Note 
that such a drawing exists only if the digraph is acyclic. 

A straightforward generalization of upward drawings is mixed upward drawings.
In mixed upward drawings, only a part of the edges in the graph are directed
and must point upward. Note that such a drawing exists only if the directed part
of the graph is acyclic.

Mixed drawings arise in applications where the edges of the graph can be
partitioned into a set which denotes structural information and a set which does
not carry structural information. An example is UML class diagrams 0 arising 
in software engineering. In these diagrams, the vertices of the graph represent 
classes in an object-oriented software system, and edges represent relations be- 
tween these classes. There are two main types of relations: generalizations and 
associations. The generalization relations describe structural information and 
form a directed acyclic subgraph in the diagram. It is an often employed con- 
vention to draw generalizations upward, whereas associations can have arbitrary 
directions [LiiJ. 

The most popular approach for creating upward drawings of digraphs is prob- 
ably the Sugiyama algorithm m- The main idea of the Sugiyama algorithm is 
to assign layers to the vertices of the graph, such that edges point in ascending 
layer order. In the next step, the number of crossings is minimized by ordering
the nodes within each layer. For a fixed layer assignment, we call a graph level planar
if it has a drawing which respects the layering and has no crossings. Several
heuristics have been proposed for this step and are used in practice, but there are also
efficient algorithms to solve the level planarity problem mm- There have been 
several attempts to apply the Sugiyama algorithm also to mixed graphs, i.e., in 
m the approach is used for UML class diagrams. 

The principal step of the Sugiyama algorithm, the layer assignment, is also 
its most severe drawback. The layer assignment restricts the freedom of choice 
for the crossing minimization algorithm drastically, and there may be large dif- 
ferences between the number of crossings for different layer assignments of one 
graph. Also, the generalizations of the Sugiyama algorithm for the mixed case 
have to assign layers to nodes with no directed adjacent edges. This only works 
when there is a low number of them, but if the directed part of the mixed graph 
is only small, the results are not satisfying and the layer assignment to the nodes 
seems artificial. 

We propose in this work a different drawing strategy for upward drawings of 
directed graphs which is based on the concept of upward planarity. A directed 
graph is upward planar if it can be drawn upward without edge crossings. Our 
strategy consists of two phases. In the first phase, we make the input graph 
upward planar by replacing edge crossings by dummy nodes. We call the result of 
this phase upward planarization. In the second phase, an upward planar drawing 
of the upward planarization is generated and the dummy nodes are discarded. 

A similar strategy has been applied very successfully in the area of draw- 
ing undirected graphs. The most popular algorithms based on this strategy are 
perhaps the graph drawing algorithms descending from the GIOTTO approach 
Eipni for orthogonal drawings. In |S| we showed recently how to extend the 
GIOTTO approach to mixed upward planar drawings of mixed upward planar 
graphs, i.e., mixed graphs which have a planar drawing in which the directed 
part of the graph is drawn upward. The above strategy can also be applied to 
this algorithm, the first phase then consisting of finding a mixed upward pla- 
narization. 

In the remainder of this paper, we concentrate on the first phase of the 
strategy, see 0 for a survey on graph drawing algorithms for upward planar 
graphs. We give an efficient heuristic that computes a high-quality upward
planarization of a directed graph. We concentrate on a heuristic approach,
since the upward planarity testing problem is already NP-complete. This is the
first time that this problem is studied; work on planarization has been restricted
to the undirected case until now. We also give a generalization of our algorithm
for mixed graphs.

We want to emphasize that the GIOTTO framework above is only one pos- 
sible application of the new algorithm. It probably deserves also attention as a 
stand-alone product which might be applicable in other environments. 

The rest of the paper is organized as follows. Section 2 gives the formal defi- 
nitions of the upward and the mixed upward planarization problem. In Section 3, 
we present an algorithm which solves the upward planarization problem. Finally,
we show how the results of Section 3 can be generalized to the mixed case.



Fig. 1. A comparison of the Sugiyama-approach (seven crossings) and our new method 
(two crossings) applied to an upward planar graph (left figure). 



2 Upward and Mixed Upward Planarization Problem 

A drawing of a graph (digraph) is a mapping of its nodes to points in the 
plane and of its edges to open Jordan curves. A graph (digraph) is planar if it 
has a drawing where no two edges have a common point. An upward drawing 
of a digraph is a drawing such that all the edges are represented by curves 
monotonically increasing in the vertical direction. A digraph is upward planar 
if it has a drawing which is upward and planar at the same time. Please note 
that there are graphs which have an upward drawing and also have a planar 
drawing, but do not have an upward planar drawing. An embedding of a graph is 
defined as a cyclic ordering of the adjacent edges of each vertex of the graph. An 
embedding is planar if there is a planar drawing of the graph which preserves this 
ordering. An upward embedding of a graph is a linear ordering of the adjacent 
edges of each vertex of the graph in which the incoming and outgoing edges 
form an interval. An upward embedding is planar if there is an upward planar 
drawing of the graph which preserves the corresponding ordering. Preserving
the ordering means that the linear ordering is equivalent to the ordering that
can be obtained by ordering the edges according to the angle they form with
a ray leaving the vertex in the direction of the negative x-axis. We assume in the
remainder of the paper that graphs have no multiple edges and no self-loops.

Given a directed graph $G = (V, E)$, the graph $G' = (V \cup V', E')$ is an upward
planarization of $G$ with crossing number $|V'|$ if and only if

- $G'$ is upward planar,

- $\deg(v) = 4$ for all $v \in V'$, and

- for every $e = (v, w) \in E$, there is a path $p(e) = (v_0, v_1), (v_1, v_2), \ldots, (v_{n-1}, v_n)$ in $G'$
  with $v = v_0$, $w = v_n$ and $v_i \in V'$ for $0 < i < n$. Every edge in $E'$ is contained
  in such a path, and two paths have no edge in common.

A mixed graph is a three-tuple $G = (V, E_d, E_u)$ with $E_d, E_u \subseteq V \times V$, where
$V$ is the set of vertices, $E_d$ is the set of directed edges and $E_u$ is the set of
undirected edges.

The mixed graph $G = (V, E_d, E_u)$ is mixed upward planar if there is a planar
drawing of $G$ where each edge in $E_d$ is represented by a curve monotonically
increasing in the vertical direction.

A mixed upward embedding is planar if there is a mixed upward planar 
drawing of the graph which preserves the corresponding ordering. 

Determining for a (mixed) graph G a (mixed) upward planarized graph is
called the (mixed) upward planarization problem. Determining for a (mixed)
graph G a (mixed) upward planarized graph with minimal crossing number is
called the (mixed) upward crossing minimization problem.

Because determining whether a graph has a planar upward embedding is a special
case of the upward crossing minimization problem and the directed case is a special
case of the mixed case, it follows that:

Corollary 1 (nni). The upward crossing minimization and the mixed upward 
crossing minimization problem are NP-hard. 

3 Upward Planarization 

In this section, we propose an algorithmic framework for the upward crossing 
minimization problem. This framework is derived from techniques for the pla- 
narization of undirected graphs, see i.e. [Zj. 

The framework consists of three parts: 

1. Construct upward planar subgraph. 

2. Determine upward embedding of this subgraph.

3. Insert edges not contained in the subgraph, one by one. 

In the first step, a subgraph of the input graph is calculated which is upward
planar. For this subgraph, an upward embedding is determined in the second
step. Of course, these two steps are only conceptually separated and can be
combined into one step. Note that finding a maximum upward planar subgraph,
i.e., finding an upward planar subgraph with the maximum number of edges, is
NP-hard. In the third step, the edges which are not part of the upward planar
subgraph are inserted incrementally into the embedding. Additionally, we can
perform some local optimizations on the resulting planarization to improve
its quality.



3.1 Maximum Upward Planar Subgraph 

The maximum upward planar subgraph problem can be stated as follows: Given a
directed graph $G = (V, E)$, find $E' \subseteq E$ such that the directed graph $G' = (V, E')$
is upward planar and $|E'|$ is maximum.

The maximum planar subgraph problem is a related problem and a lot of 
algorithms have been proposed for its solution All of them, 

except HSl, which can also compute the optimal solution when no time limit is 
specified, are heuristics, since the problem is NP-complete. Cimikowski p] com- 
pared some of them empirically. In his comparison, the algorithm of Jiinger and 
Miitzeh.TTVnpi 5] performed best in solution quality, followed by the algorithm of 
Goldschmidt and Takvorian (GT)fH. The fastest algorithm was the one based 
on PQ-treesfl 6|. but its performance in terms of the solution quality was signif- 
icantly lower than JM and GT. Resende and Ribero give in HH a randomized 
formulation of GT and show on the same test set as 0 that their formulation 
achieves better results with the same running time performance, except for one 
family of graphs where JM performs better. 

However, GT is much easier to implement than JM. JM is a branch-and-cut
algorithm and is, therefore, based on
sophisticated algorithms for linear programming.

Because of its performance and its ease of implementation, we use GT as a
starting point. In the next section, we review the GT algorithm and show in the
following section how it can be modified to calculate upward planar embeddings.



3.2 The Goldschmidt/Takvorian Planarization Algorithm 

In this section, we review the main components of GT, the two-phase heuristic 
of Goldschmidt and Ta.kvoria.n jl l|. Our description follows the one in The 
first phase of GT consists in devising an ordering $\Pi$ of the vertex set $V$ of
the input graph $G$. This ordering should, if possible, induce a Hamiltonian path. The
vertices of $G$ are placed on a vertical line according to the ordering $\Pi$ obtained
in the first phase, such that as many edges as possible between adjacent vertices
can also be placed on the line. All other edges are drawn as arcs either right or
left of the line.

The second phase of GT partitions the edge set $E$ of $G$ into subsets $\mathcal{L}$ (left of
the line), $\mathcal{R}$ (right of the line), and $\mathcal{B}$ (the remainder) in such a way that
$|\mathcal{L}| + |\mathcal{R}|$ is large (ideally maximum) and that no two edges both in $\mathcal{L}$ or both
in $\mathcal{R}$ cross with respect to the sequence $\Pi$ devised in the first phase.

Let $\pi(v)$ denote the relative position of vertex $v \in V$ within the vertex sequence
$\Pi$. Furthermore, let $e_1 = (a, b)$ and $e_2 = (c, d)$ be two edges of $G$, such that,
without loss of generality, $\pi(a) < \pi(b)$ and $\pi(c) < \pi(d)$. These edges are said
to cross if, with respect to sequence $\Pi$, $\pi(a) < \pi(c) < \pi(b) < \pi(d)$ or
$\pi(c) < \pi(a) < \pi(d) < \pi(b)$.
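
The crossing condition translates directly into a small predicate. This is a minimal sketch for illustration; the function name `edges_cross` and the dictionary `pos` (playing the role of the position function $\pi$) are assumptions of this sketch, not names used by GT.

```python
def edges_cross(pos, e1, e2):
    """Return True if e1 and e2 cross with respect to the vertex sequence.

    pos maps each vertex to its position in the sequence.
    """
    # Orient both edges so that the first endpoint comes first in the sequence.
    a, b = sorted(e1, key=pos.__getitem__)
    c, d = sorted(e2, key=pos.__getitem__)
    return (pos[a] < pos[c] < pos[b] < pos[d]
            or pos[c] < pos[a] < pos[d] < pos[b])


order = ["s", "x", "y", "t"]
pos = {v: i for i, v in enumerate(order)}
print(edges_cross(pos, ("s", "y"), ("x", "t")))  # interleaved endpoints
print(edges_cross(pos, ("s", "t"), ("x", "y")))  # nested endpoints
```

Edges whose endpoints interleave along the line cross, whereas nested edges or edges sharing an endpoint do not.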

The conflict graph has a vertex for every edge in $G$, and two vertices are
adjacent if the corresponding edges cross with respect to $\Pi$. It follows directly
from its definition that the conflict graph is an overlap graph, i.e., a graph whose
vertices can be represented as intervals such that two vertices are adjacent if and only
if the corresponding intervals intersect but neither of the two is contained in the
other.

An induced bipartite subgraph of the conflict graph represents a valid assignment
of the edges in $G$ to the sets $\mathcal{L}$, $\mathcal{R}$ and $\mathcal{B}$. Since finding a maximum induced
bipartite subgraph is NP-complete, even for overlap graphs, GT uses a heuristic.
This heuristic calculates two disjoint independent sets of the conflict graph
which, together, form a bipartite subgraph of the conflict graph.
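
The second phase can be sketched as follows. Note that this sketch swaps in a simple greedy independent-set heuristic (lowest conflict degree first) in place of the maximum independent set computation on overlap graphs that GT relies on; it only illustrates the structure of the partition. All function names are hypothetical.

```python
from itertools import combinations


def crosses(pos, e1, e2):
    """Interleaving test for two edges along the vertex sequence."""
    a, b = sorted((pos[e1[0]], pos[e1[1]]))
    c, d = sorted((pos[e2[0]], pos[e2[1]]))
    return a < c < b < d or c < a < d < b


def conflict_graph(order, edges):
    """One node per edge; two nodes are adjacent if the edges cross."""
    pos = {v: i for i, v in enumerate(order)}
    adj = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if crosses(pos, e1, e2):
            adj[e1].add(e2)
            adj[e2].add(e1)
    return adj


def greedy_independent_set(adj, candidates):
    """Greedy stand-in (an assumption of this sketch) for a maximum independent set."""
    cand = set(candidates)
    chosen, blocked = [], set()
    for e in sorted(candidates, key=lambda e: len(adj[e] & cand)):
        if e not in blocked:
            chosen.append(e)
            blocked |= adj[e]
    return chosen


def partition_edges(order, edges):
    """Split the edges into left, right and remainder."""
    adj = conflict_graph(order, edges)
    left = greedy_independent_set(adj, edges)
    rest = [e for e in edges if e not in left]
    right = greedy_independent_set(adj, rest)
    remainder = [e for e in rest if e not in right]
    return left, right, remainder


print(partition_edges([1, 2, 3, 4], [(1, 3), (2, 4), (1, 2)]))
```

In the example, the two crossing edges (1, 3) and (2, 4) end up on different sides of the line, so the remainder is empty.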

A maximum independent set of an overlap graph can be calculated in time
$O(NM)$, where $N$ is the number of different interval endpoints and $M$ is the
number of edges in the overlap graph, by the algorithm of Asano, Imai and
Mukaiyama. In our setting, $N \le n$ and $M \le m^2$, which leads to a running
time of $O(nm^2)$.

3.3 The Directed Version of the GT Algorithm

We now present our variant of the GT algorithm for computing an upward planar
subgraph. To obtain an upward planar subgraph,
we have to modify the first step of GT, the construction of the vertex
order. The vertex order must ensure that no directed edge has a target vertex
that precedes its source vertex in the order. This is achieved by using algorithm
vertex order as the first phase of GT. We call this variant directed GT or, for short,
DGT to distinguish it from the original formulation. The algorithm vertex or- 
der is a modification of the algorithm El and ensures this. It is a variation of 
a topological sorting algorithm, and constructs the ordering incrementally.
Assume that vertex v is the vertex chosen in the last step. The algorithm chooses
a vertex in the next step which is adjacent to v, but which is not the successor
of an unchosen vertex. If this is not possible, it takes a vertex of minimal degree
which, additionally, is not the successor of an unchosen vertex. As the first vertex,
it chooses a vertex with no incoming edge with minimal degree.

Lemma 1. Let $G$ be a directed graph. If the vertex order $\Pi$ in the first phase
of the GT algorithm is a topological ordering of $G$, the result of GT is an upward
planar subgraph of $G$.

Lemma 2. The vertex order calculated by algorithm vertex order is a topological
ordering of $G$.

From the sets $\mathcal{L}$ and $\mathcal{R}$ and the permutation $\Pi$, we can now easily obtain the
upward planar embedding. The details are omitted because of space limitations.
We conclude the section with the following theorem:

Algorithm 1: vertex order

Input: A directed graph G = (V, E)
Output: A permutation Π of the vertices
select v_1 as a vertex of G with no incoming edge and minimal degree;
V = V \ {v_1};
G_1 = directed graph induced on G by V;
for k = 2, ..., |V| do
    U = {u ∈ V | u is not the target of a directed edge in G_{k-1}};
    if v_{k-1} is adjacent to a vertex in U then
        select v_k as a vertex in U adjacent to v_{k-1} with minimal degree in G_{k-1}
    else
        select v_k as a vertex with minimal degree in U
    end
    V = V \ {v_k};
    G_k = directed graph induced on G by V;
end
return Π = (v_1, v_2, ..., v_{|V|})
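
A possible rendering of algorithm vertex order in Python is sketched below. It follows the prose description (prefer a neighbor of the last chosen vertex, otherwise any vertex without unchosen predecessors, always with minimal degree in the remaining graph). The tie-breaking rule by vertex name and the quadratic running time are simplifications assumed for this sketch.

```python
def vertex_order(vertices, edges):
    """Return a topological order built in the spirit of algorithm vertex order.

    edges is a list of directed pairs (u, v), meaning u -> v.
    """
    remaining = set(vertices)
    nbrs = {v: set() for v in vertices}    # neighbors, ignoring direction
    preds = {v: set() for v in vertices}   # sources of incoming edges
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
        preds[v].add(u)

    def degree(v):
        # Degree in the graph induced by the vertices not yet chosen.
        return len(nbrs[v] & remaining)

    order, last = [], None
    while remaining:
        # Vertices that are not the successor of an unchosen vertex.
        free = [v for v in remaining if not (preds[v] & remaining)]
        if not free:
            raise ValueError("the graph contains a directed cycle")
        adjacent = [v for v in free if last is not None and v in nbrs[last]]
        pool = adjacent or free
        # Ties are broken by name only to make the sketch deterministic.
        nxt = min(pool, key=lambda v: (degree(v), str(v)))
        order.append(nxt)
        remaining.remove(nxt)
        last = nxt
    return order


print(vertex_order([1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]))
```

Because a vertex is only eligible once all of its predecessors have been chosen, the returned sequence is a topological ordering, as required by Lemma 1.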



Theorem 1. Algorithm DGT computes an upward planar subgraph, together
with an upward planar embedding of this subgraph, in time $O(nm^2)$.

3.4 Edge Insertion 

There is an interesting difference between the insertion of directed and undirected 
edges. In the undirected case, the edges which are not part of the planar subgraph 
in the first step can be inserted independently of each other. This is different 
in the directed case. Here, we cannot insert an edge into the drawing without 
looking at the remaining edges which have to be inserted later. The reason for 
this is that introducing dummy nodes in the graph introduces changes in the 
ordering of the vertices of the graph. This may introduce directed cycles if an 
edge is added later. 

Assume that the dashed edges have to be inserted in Fig. 2(a), and we start
by inserting edge (5, 9). If we do not work carefully and insert edge (5, 9)
as in Fig. 2(b), we produce a crossing C with edge (1, 3) and some new edges
in which C is involved. Then, it is no longer possible to introduce edge (3, 4) without
destroying the upwardness property because of the new directed cycle
5 - C - 3 - 4 - 5.

We call a vertex with indegree 0 a source, and a vertex with outdegree 0 
a sink. A directed graph is called an s-t graph if it has exactly one sink and 
one source. We first restrict ourselves to s-t graphs. We show later how we can 
remove this restriction. 

Fig. 2. Edge insertion: Critical configuration and the routing graph

As shown above, we have to avoid cycles when we insert edges. We achieve
this by layering the graph. A valid layering $l$ of a directed graph $G = (V, E)$
is a mapping of $V$ to integers such that $l(v) > l(u)$ for each edge $(u, v) \in E$.
We then construct a routing graph $R$. The routing graph contains a vertex for each face
$f$ and for each layer that $f$ spans. Two vertices lying in neighboring
layers and representing the same face are connected by a directed edge of weight
0 in increasing layer order. Additionally, two vertices at the same layer $i$ of
adjacent faces are connected by an edge of weight 1 if the layer of the source vertex of
an edge separating these two faces is less than or equal to $i$ and the layer of the
target vertex is greater than $i$.
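
A valid layering can be obtained, for instance, from a longest-path layering computed along a topological order. The sketch below is one such construction and is not claimed to be the layering used in our implementation; the function name `valid_layering` is hypothetical. For edge insertion, the edge list passed in would also contain the edges that are still to be inserted.

```python
from collections import deque


def valid_layering(vertices, edges):
    """Return a dict l with l[v] > l[u] for every directed edge (u, v)."""
    succ = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    layer = {v: 0 for v in vertices}
    queue = deque(v for v in vertices if indeg[v] == 0)
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v in succ[u]:
            # v must lie strictly above each of its predecessors.
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if processed != len(layer):
        raise ValueError("the graph contains a directed cycle")
    return layer


print(valid_layering(["s", "a", "t"], [("s", "a"), ("a", "t"), ("s", "t")]))
```

Each vertex is placed one layer above its highest predecessor, so the number of layers is the smallest possible for a valid layering.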

In this graph, there are no edges in decreasing layer order. Each edge of weight 1
represents one crossing. A shortest path in the routing graph represents, therefore,
an insertion of an edge with a minimal number of crossings with respect to the given
layering. Figure 2(c) shows an example of a routing graph.

Let $s(f)$, resp. $t(f)$, denote the source, resp. sink, of a face $f$. Note that in
an s-t graph, every face has exactly one source and one sink. Furthermore, $lf(e)$,
resp. $rf(e)$, denotes the face on the left, resp. right, side of $e$. We consider the
outer face as two faces, the left-outer face and the right-outer face. The left-outer
face denotes the left part of the outer face, the right-outer face the right part.
The algorithm edge insertion summarizes the construction.

Lemma 3. The graph G' calculated in edge insertion is an s-t graph and upward 
planar. 

Proof. In the edge insertion step, we do not decrease the indegree or the outde- 
gree of any vertex existing already in the input graph. Therefore, we only have 
to show that none of the inserted vertices is a sink or a source. But this is true, 
since each of these vertices has indegree two and outdegree two. G' is upward 
planar, since there are no crossings and the layering is conserved. 

Lemma 4. The graph $(V', E' \cup F \setminus \{e\})$ is acyclic.

Proof. Assume that there is a directed cycle. Each inserted vertex $w_i$ is induced
by an edge of weight 1 in the routing graph which connects two face vertices
lying in the same layer. Assign this layer to vertex $w_i$. Now a layer is
assigned to each vertex, and there are no edges which point in decreasing layer order. Thus, the
cycle can only contain vertices in the same layer. By the construction of the layering,
these can only be vertices $w_i$. But from this it follows that the shortest
path contained a directed cycle, which is a contradiction.

Algorithm 2: edge insertion

Input: Embedded upward planar s-t graph G = (V, E), F ⊆ V × V, e = (a, b) ∈ F
Output: Embedded upward planarized s-t graph G' of (V, E ∪ {e})
calculate faces from embedding;
determine valid layering l of (V, E ∪ F);
for every face f of G do
    create vertices v(f, i) for l(s(f)) ≤ i < l(t(f));
    create edge of weight 0 from v(f, i - 1) to v(f, i) for l(s(f)) < i < l(t(f))
end
for every edge e' = (c, d) of E do
    create edge of weight 1 from v(lf(e'), i) to v(rf(e'), i) and the reverse for l(c) ≤ i < l(d)
end
create vertices v(a) and v(b) representing a resp. b;
insert edge of weight 0 from v(a) to v(f, l(a)) if f is adjacent to a and such a vertex exists;
insert edge of weight 0 from v(f, l(b) - 1) to v(b) if f is adjacent to b and such a vertex exists;
calculate shortest path p from v(a) to v(b) in R;
E' = E, V' = V, G' = (V', E');
let e_0, ..., e_n be the edges of weight 1 in p;
subdivide e_i with a vertex w_i, 0 ≤ i ≤ n, in G';
add an edge between a and w_0, w_n and b, and w_i and w_{i+1} in E';
return G'



Lemma 5. Algorithm edge insertion has time complexity $O(|V|^2)$.

Proof. The faces of the graph can be computed in linear time from the embedding.
A valid layering with a minimal number of layers can also be computed in
linear time using a topological sorting. The maximum number of layers is linear,
since the length of a topological order is an upper bound for the number of layers. Hence,
the number of vertices in the routing graph is $O(|V|^2)$ and, since each vertex
has constant degree, the total size of the routing graph is $O(|V|^2)$. Because the
maximum cost of an edge is 1, we can use Dial's shortest path algorithm,
which has linear running time in this case. The insertion of the edge can clearly
be done in linear time.
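
With edge weights restricted to 0 and 1, a bucket-based shortest path search needs only two buckets at a time. The sketch below uses the common deque-based formulation (often called 0-1 BFS) as a stand-in for Dial's algorithm; the adjacency-list encoding, the function name and the node names in the example are assumptions of this sketch.

```python
from collections import deque


def zero_one_shortest_path(adj, source, target):
    """Shortest path in a graph whose edge weights are all 0 or 1.

    adj maps a node to a list of (neighbor, weight) pairs.
    Returns (cost, path), or (None, []) if target is unreachable.
    """
    dist = {source: 0}
    parent = {source: None}
    dq = deque([source])
    while dq:
        u = dq.popleft()
        if u == target:
            break
        for v, w in adj.get(u, ()):
            nd = dist[u] + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                # Weight-0 edges keep the current distance: process them first.
                if w == 0:
                    dq.appendleft(v)
                else:
                    dq.append(v)
    if target not in dist:
        return None, []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    return dist[target], path[::-1]


# Tiny routing graph: two faces f and g, two layers, one separating edge.
routing = {
    "a": [("f0", 0)],
    "f0": [("g0", 1), ("f1", 0)],
    "f1": [("g1", 1)],
    "g0": [("g1", 0)],
    "g1": [("b", 0)],
}
print(zero_one_shortest_path(routing, "a", "b"))
```

The cost of the returned path equals the number of weight-1 edges on it, i.e., the number of crossings introduced by the inserted edge.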

The following theorem summarizes the lemmas above. 

Theorem 2. Algorithm edge insertion inserts one edge in an embedded upward
planar s-t graph $G = (V, E)$ in time $O(|V|^2)$ without introducing cycles with a
set of not yet inserted edges. The planarized graph is an s-t graph.

3.5 The Complete Algorithm

Algorithm upward planarization contains a description of the algorithm. Note 
that if the input graph for the second phase is not an s-t graph, we augment it 
to a planar upward s-t graph, see P| for a linear time algorithm. Edges in the 
routing graph representing an edge added in the augmenting step are assigned 
weight 0, because they do not introduce a real crossing. After the routing, the 
augmenting edges are removed. Note that the augmentation does not affect the 
worst-case running time of the algorithm, since the number of edges in the graph 
remains linear in the number of nodes. 



Algorithm 3: upward-planarization

calculate embedded mixed upward planar subgraph with DGT;
augment subgraph to an s-t graph;
for each directed edge not in subgraph do
    call algorithm directed edge insertion;
end
remove edges inserted in augmentation process;
for each undirected edge not in subgraph do
    call algorithm undirected edge insertion;
end



From the discussion above, we derive the following theorem: 

Theorem 3. Let G = (V,E) be a directed graph. Algorithm upward-planarization
creates an embedded upward planarized graph of G in time
$O(|V||E|^2 + (|V| + c)^2 |E|)$, where c is the number of crossings of the planarized
graph. When G is sparse, i.e. $|E| = O(|V|)$, the algorithm upward-planarization
runs in time 0(|Up). 

However, the time bound in the theorem above is very pessimistic. In our 
experiments, the running time of the algorithm is reasonable, even for larger 
graphs. 

3.6 Rerouting 

In this section, we present a local optimization method for an upward planariza- 
tion. This method removes a path representing an edge from the planarization. 
Then, we augment the resulting graph to an s-t-graph. For the removed edge, we 
try to find a better routing. If we succeed, we change the planarization according 
to the new routing. Otherwise, we do not change the planarization. In any case, 
the augmented edges are removed. We stop this local optimization when either 
no further improvements are made or a maximal number of iterations has been 
performed. 

There is one advantage of rerouting with respect to routing: we do not have 
to observe that some edges will be inserted later and might cause cycles. We, 
therefore, do not need the layering concept here. 

We now define the routing graph for the rerouting of an edge from a to b.
Note that we assume that the outer face is split into two faces, one for each
side. The routing graph contains a start node s and an end node t. Each face f
has, for each adjacent edge e, a node v(f, e). We call these nodes edge nodes. It
has, additionally, two nodes l(f) and r(f), except for the left outer face, which
has only node r(f), and the right outer face, which has only node l(f). We call
those nodes face nodes. For each edge e, the two corresponding edge nodes are
connected by a directed edge of weight 1, except for edges introduced in the
augmentation step. In this case, the weight is 0. If an edge $e_1$ is directly below
an edge $e_2$ on a face f, the routing graph contains a directed edge from
$v(f, e_1)$ to $v(f, e_2)$ with weight 0. For a face f and a node v representing an
edge on the left, resp. right, side of f, there is a directed edge of cost 0 from v
to r(f), resp. l(f). There is a directed edge of weight 0 from the start node s to
each node in the routing graph representing an outgoing edge of a. There is a
directed edge of weight 0 from each node in the routing graph representing an
incoming edge of b to the end node t.

Lemma 6. The routing graph has linear size with respect to the planarization. 

Proof. The number of nodes in the routing graph is $2|E'| + 2|F| + 2$, and it is
therefore linear in the number of nodes of the planarization by Euler's formula.
Each edge node in the routing graph is adjacent to at most eight edges. Because
face nodes are only connected to edge nodes in the routing graph but not to
other face nodes, the number of edges is linear in the number of nodes in the
routing graph.

The methods and observations in the subsection 'Edge Insertion' also apply
to rerouting; only the routing graph is different. Therefore, we can conclude
with the following theorem:

Theorem 4. Rerouting of an edge $e \in E$ of a graph G = (V, E) in a planarization
G' = (V', E') of G can be done in time $O(|V'|)$.

4 Mixed Upward Planarization 

In this section we show how the concepts in the previous sections can be extended 
to the mixed case, i.e., the input graph is a mixed graph. 

For the mixed planar subgraph calculation, we also use the GT algorithm. 
As in the upward case, we only have to take care of the vertex ordering. We use 
a modified version of the vertex-ordering algorithm for this, which ignores the 
direction of the undirected edges. The changes are omitted here, because they
add no major additional insight into the algorithm. They can be found in [3].

It is clear that the directed edges are more restrictive than the undirected. So, 
it is intuitive to prefer the directed edges when computing the planar subgraph. 
One variant of the planar subgraph algorithm takes this aspect into account. 
It extends the GT approach by assigning different weights to the directed and 
undirected edges and then optimizes over the weighted sum of the edges P . The
actual choice of the weights depends on the application, as well as on the class 
of graphs considered and is the subject of further research. 

The upward edge insertion algorithm can be extended to the mixed case by
directing the undirected edges in G temporarily according to the ordering in GT.
We then insert the directed edges iteratively in the graph as described above.
Next, we undirect the temporarily directed edges. Finally, we insert the
undirected edges by a standard edge insertion algorithm for undirected graphs [Z|.

Theorem 5. Let $G = (V, E_d, E_u)$ be a mixed graph. Algorithm
mixed-upward-planarization creates an embedded mixed upward planarized graph of G
in time $O(|V|(|E_d| + |E_u|)^2 + (|V| + c)^2 |E_d| + (|V| + c)|E_u|)$, where c is
the number of crossings in the planarized graph.

Abstract. Hannenhalli and Pevzner gave the first polynomial-time al-
gorithm for computing the inversion distance between two signed permutations,
as part of the larger task of determining the shortest sequence
of inversions needed to transform one permutation into the other. Their
algorithm (restricted to distance calculation) proceeds in two stages: in
the first stage, the overlap graph induced by the permutation is decomposed
into connected components, then in the second stage certain
graph structures (hurdles and others) are identified. Berman and Hannenhalli
avoided the explicit computation of the overlap graph and gave
an $O(n\alpha(n))$ algorithm, based on a Union-Find structure, to find its
connected components, where $\alpha$ is the inverse Ackermann function. Since for
all practical purposes $\alpha(n)$ is a constant no larger than four, this
algorithm has been the fastest practical algorithm to date. In this paper, we
present a new linear-time algorithm for computing the connected components,
which is more efficient than that of Berman and Hannenhalli in
both theory and practice. Our algorithm uses only a stack and is very
easy to implement. We give the results of computational experiments
over a large range of permutation pairs produced through simulated evolution;
our experiments show a speed-up by a factor of 2 to 5 in the
computation of the connected components and by a factor of 1.3 to 2 in
the overall distance computation.



1 Introduction 

Some organisms have a single chromosome or contain single-chromosome
organelles (such as mitochondria or chloroplasts), the evolution of which is
largely independent of the evolution of the nuclear genome. Given a particular
strand from a single chromosome, whether linear or circular, we can infer the
ordering and directionality of the genes, thus representing each chromosome
by an ordering of oriented genes. In many cases, the evolutionary process that
operates on such single-chromosome organisms consists mostly of inversions of
portions of the chromosome; this finding has led many biologists to reconstruct
phylogenies based on gene orders, using as a measure of evolutionary distance
between two genomes the inversion distance, i.e., the smallest number of
inversions needed to transform one signed permutation into the other pilliljOj .

Both inversion distance and the closely related transposition distance are
difficult computational problems that have been studied intensively over the last
five years. Finding the inversion distance between unsigned
permutations is NP-hard 0, but with signed ones, it can be done in polynomial 
time HH. The fastest published algorithm for the computation of inversion
distance between two signed permutations has been that of Berman and
Hannenhalli jS|, which uses a Union-Find data structure and runs in $O(n\alpha(n))$
time, where $\alpha(n)$ is the inverse Ackermann function. (The later KST algorithm
m reduces the time needed to compute the shortest sequence of inversions,
but uses the same algorithm for computing the length of that sequence.) We
have found only two implementations on the web, both designed to compute
the shortest sequence of inversions as well as its length; one, due to Hannenhalli,
implements his first algorithm E], which runs in quadratic time when
computing distances, while the other, a Java applet written by Mantin H3|,
a student of Shamir, implements the KST algorithm H2|, but uses an explicit
representation of the overlap graph and thus also takes quadratic time.

We present a simple and practical, worst-case linear-time algorithm to 
compute the connected components of the overlap graph, which results in a 
simple linear-time algorithm for computing the inversion distance between two 
signed permutations. We also provide ample experimental evidence that our 
linear-time algorithm is efficient in practice as well as in theory: we coded it 
as well as the algorithm of Berman and Hannenhalli, using the best principles 
of algorithm engineering to ensure that both implementations would be
as efficient as possible, and compared their running times on a large range of
instances generated through simulated evolution. (The two implementations on
the web are naturally far slower.)

The paper is organized as follows. We begin by recalling some definitions, 
briefly review past work on sorting by reversals, then introduce the concepts 
that we will need in our algorithm, including the fundamental theorem that 
makes it possible. We then describe and analyze our algorithm, discuss our 
experimental setup, present and comment on our results, and briefly mention 
an application of our distance computation in a whole-genome phylogeny study. 



2 Inversions on Signed Permutations 

We assume a fixed set of genes $\{g_1, g_2, \ldots, g_n\}$. Each genome is then an
ordering (circular or linear) of these genes, each gene given with an orientation
that is either positive ($g_i$) or negative ($-g_i$). The ordering
$g_1, g_2, \ldots, g_n$, whether linear or circular, is considered equivalent to
that obtained by considering the complementary strand, i.e., the ordering
$-g_n, -g_{n-1}, \ldots, -g_1$.

Let G be the genome with signed ordering (linear or circular)
$g_1, g_2, \ldots, g_n$. An inversion between indices i and j, for i < j, produces
the genome with linear ordering

$$g_1, g_2, \ldots, g_{i-1}, -g_j, -g_{j-1}, \ldots, -g_i, g_{j+1}, \ldots, g_n$$

If we have j < i, we can still apply an inversion to a circular (but not linear) 
genome by rotating the circular ordering until the two indices are in the proper 
relationship — recall that we consider all rotations of the complete circular 
ordering of a circular genome as equivalent. 
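
To make the definition concrete, here is a minimal Python sketch of a single inversion on a linear genome. The function name `apply_inversion` and the representation of a genome as a list of nonzero integers (gene i written as +i or -i) are assumptions chosen for illustration; indices are 1-based as in the text, and the sketch also accepts i = j, which flips a single gene.

```python
def apply_inversion(genome, i, j):
    """Return the linear genome obtained by inverting genes i..j (1-based, inclusive).

    genome is a list of nonzero integers; the sign encodes the orientation.
    """
    if not (1 <= i <= j <= len(genome)):
        raise ValueError("need 1 <= i <= j <= n")
    # Reverse the order of the segment and flip the sign of every gene in it.
    segment = [-g for g in reversed(genome[i - 1:j])]
    return genome[:i - 1] + segment + genome[j:]


print(apply_inversion([1, 2, 3, 4, 5], 2, 4))  # [1, -4, -3, -2, 5]
```

Applying the same inversion twice restores the original genome, which is one way to see that the resulting distance is symmetric.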

The inversion distance between two genomes (two signed permutations of 
the same set) is then the minimum number of inversions that must be applied 
to one genome in order to produce the other. (This measure is easily seen to be 
a true metric.) Computing the shortest sequence of inversions that gives rise 
to this distance is also known as sorting by reversals — we shall shortly see why 
it can be regarded as a sorting problem. 

3 Previous Work 

Bafna and Pevzner introduced the cycle graph of a permutation j2j, thereby
providing the basic data structure for inversion distance computations.
Hannenhalli and Pevzner then developed the basic theory for expressing the inversion
distance in easily computable terms (number of breakpoints minus number
of cycles plus number of hurdles plus a correction factor for a fortress |3
E2]; hurdles and fortresses are easily detectable from a connected component
analysis). They also gave the first polynomial-time algorithm for sorting signed
permutations by reversals El, and proposed an $O(n^4)$ implementation of
their algorithm cni, which runs in quadratic time when restricted to distance
computation. Their algorithm requires the computation of the connected
components of the overlap graph, which is the bottleneck for the distance
computation. Berman and Hannenhalli later exploited some combinatorial
properties of the cycle graph to give an $O(n\alpha(n))$ algorithm to compute the
connected components, leading to an $O(n^2\alpha(n))$ implementation of the sorting
algorithm |Hj. (We will refer to this approach as the UF approach.) Algorithms
for finding the connected components of interval graphs (a class of graphs that
includes the more specialized overlap graphs used in sorting by reversals) that
run in linear time are known, but they use range minima and lowest common
ancestor data structures and algorithms so that, in addition to being complex
and hard to implement, they suffer from high overhead, high enough, in fact,
that the UF approach would remain the faster solution in practice.
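
Written as a formula, the expression described in words above takes the following form. The symbols b, c, h, and f are shorthand introduced here for the four quantities named in the text (breakpoints, cycles, hurdles, and the fortress correction); they are not notation taken from this paper.

```latex
d(\pi) \;=\; b(\pi) \;-\; c(\pi) \;+\; h(\pi) \;+\; f(\pi),
\qquad
f(\pi) =
\begin{cases}
1 & \text{if } \pi \text{ is a fortress},\\
0 & \text{otherwise}.
\end{cases}
```

Of these terms, only the hurdle and fortress terms require the connected components of the overlap graph, which is why that computation dominates the cost of the distance calculation.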

4 Overlap Graph and Forest 

Given a signed permutation of {1, ..., n}, we transform it into an unsigned
permutation $\pi$ of {1, ..., 2n} by substituting the ordered pair (2x - 1, 2x) for
the positive element x and the ordered pair (2x, 2x - 1) for the negative element
-x, then extend $\pi$ to the set {0, 1, ..., 2n, 2n + 1} by setting $\pi(0) = 0$ and
$\pi(2n + 1) = 2n + 1$. By convention, we assume that the two signed permutations
for which we must compute a distance have been turned in this manner into
unsigned permutations and then both permuted (by the same transformation)
so that the first permutation becomes the linear ordering (0, 1, ..., 2n, 2n + 1);
these manipulations do not affect the distance value. (This is the reason why
transforming one permutation into the other can be viewed as sorting: we want
to find out how many inversions are needed to produce the identity permutation
from the given one.) We represent an extended unsigned permutation with an
edge-colored graph, the cycle graph of the permutation. The graph has 2n + 2
vertices; for each i, $0 \le i \le n$, we join vertices $\pi(2i)$ and $\pi(2i+1)$ by a
black edge and vertices 2i and 2i + 1 by a gray edge, as illustrated in
Figure 1(a). The resulting graph consists of disjoint cycles in which edges
alternate colors; we remove from it all 2-cycles (because these cycles correspond
to portions of the permutation that are already sorted and cannot intersect with
any other cycles). We say that gray edges $(\pi(i), \pi(j))$ and $(\pi(k), \pi(l))$
overlap whenever the two intervals [i, j] and [k, l] overlap, but neither contains
the other. Similarly, we say that cycles $C_1$ and $C_2$ overlap if there exist
overlapping gray edges $e_1 \in C_1$ and $e_2 \in C_2$.

i      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
$\pi$  0 5 6 17 18 14 13 9 10 20 19 15 16 7 8 12 11 21 22 3 4 1 2 23
       +3 +9 -7 +5 -10 +8 +4 -6 +11 +2 +1

(a) the signed permutation, its unsigned extension, and its cycle graph
(note that the gray edges appear as dashed arcs)
(b) the overlap graph (c) the overlap forest

Fig. 1. The signed permutation (+3, +9, -7, +5, -10, +8, +4, -6, +11, +2, +1) and its
various representations

Definition 1. The overlap graph of permutation $\pi$ has one vertex for each
cycle in the cycle graph and an edge between any two vertices that correspond to
overlapping cycles.

Figure 1 illustrates the concept. The extent of cycle C is the interval [C.B, C.E],
where we have $C.B = \min\{i \mid \pi(i) \in C\}$ and $C.E = \max\{i \mid \pi(i) \in C\}$.
The extent of a set of cycles $\{C_1, \ldots, C_k\}$ is [B, E], with
$B = \min_{1 \le i \le k} C_i.B$ and $E = \max_{1 \le i \le k} C_i.E$. In Figure 1,
the extent of cycle A is [0,21], that of cycle F is [18,23], and that of the set
{A,F} is [0,23].
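
The transformation to an extended unsigned permutation and the notion of extent can be illustrated with a minimal Python sketch. The function names `extend_unsigned` and `cycle_extents`, and the choice to identify each cycle by its first position, are assumptions made for illustration. The sketch treats positions as vertices and alternately follows the edge between the positions holding values 2k and 2k+1 and the edge between positions 2i and 2i+1. Unlike the text, it does not discard 2-cycles.

```python
def extend_unsigned(signed):
    """Map a signed permutation of 1..n to the extended unsigned one on 0..2n+1."""
    pi = [0]
    for x in signed:
        if x > 0:
            pi += [2 * x - 1, 2 * x]      # positive x becomes (2x-1, 2x)
        else:
            pi += [-2 * x, -2 * x - 1]    # negative -x becomes (2x, 2x-1)
    pi.append(2 * len(signed) + 1)
    return pi


def cycle_extents(pi):
    """Return (begin, extent) for the cycles of the extended permutation pi.

    begin[p] is the first position on the cycle through position p, and
    extent[b] = (B, E) is the extent of the cycle that begins at position b.
    """
    m = len(pi)
    where = [0] * m                       # where[v] = position holding value v
    for pos, val in enumerate(pi):
        where[val] = pos
    begin = [-1] * m
    extent = {}
    for start in range(m):
        if begin[start] != -1:
            continue
        last, p = start, start
        while begin[p] == -1:
            q = where[pi[p] ^ 1]          # partner value: 2k <-> 2k+1
            begin[p] = begin[q] = start
            last = max(last, p, q)
            p = q ^ 1                     # partner position: 2i <-> 2i+1
        extent[start] = (start, last)
    return begin, extent


pi = extend_unsigned([3, 9, -7, 5, -10, 8, 4, -6, 11, 2, 1])
begin, extent = cycle_extents(pi)
print(extent[0], extent[18])  # (0, 21) (18, 23)
```

On the permutation of Fig. 1, the two printed extents are those stated in the text for cycles A and F.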

No algorithm that actually builds the overlap graph can run in linear time, 
since that graph can be of quadratic size. Thus, our goal is to construct an 
overlap forest such that two vertices f and g belong to the same tree in the
forest exactly when they belong to the same connected component in the 
overlap graph. An overlap forest (the composition of its trees is unique, but 
their structure is arbitrary) has exactly one tree per connected component of 
the overlap graph and is thus of linear size. 

5 The Linear-Time Algorithm for Connected Components 

Our algorithm for computing the connected components scans the permutation 
twice. The first scan sets up a trivial forest in which each node is its own tree, 
labelled with the beginning of its cycle. The second scan carries out an iterative 
refinement of this first forest, by adding edges and so merging trees in the forest; 
unlike a Union-Find, however, our algorithm does not attempt to maintain the 
trees within certain shape parameters. 

Recall that a node in the overlap graph (or forest) corresponds to a cycle in
the cycle graph. The extent [f.B, f.E] of a node f of the overlap forest is the
extent of the set of nodes in the subtree rooted at f. Let $F_0$ be the trivial forest
set up in the first scan and assume that the algorithm has processed elements 0
through j - 1 of the permutation, producing forest $F_{j-1}$. We construct $F_j$ from
$F_{j-1}$ as follows. Let f be the cycle containing element j of the permutation. If
j is the beginning of its own cycle f, then it must be the root of a single-node
tree; otherwise, if f overlaps with another cycle g, then we add a new arc (g, f)
and compute the combined extent of g and of the tree rooted at f. We say that
a tree rooted at f is active at stage j whenever j lies properly within the extent
of f; we shall store the extent of the active trees in a stack.

Figure 2 summarizes our algorithm for constructing the overlap forest; in the
algorithm, top denotes the top element of the stack. The conversion of a forest 
of up-trees into connected component labels is accomplished in linear time by 
a simple sweep of the array, taking advantage of the fact that the parent of i 
must appear before i in the array. 
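
The final sweep can be sketched in a few lines of Python. This is an illustration under assumptions of ours: the forest is given as a dictionary `parent` that maps the beginning position of each cycle to the beginning position of its parent, with roots mapped to themselves, and the function name `component_labels` is hypothetical.

```python
def component_labels(parent):
    """Turn up-tree parent pointers into one label per connected component.

    parent maps each cycle's beginning position to its parent's beginning
    position; roots map to themselves (hypothetical input format).
    """
    label = {}
    # A parent always begins before its child, so visiting the cycles in
    # increasing order guarantees that label[parent] is already known.
    # (With an array indexed by position, a plain left-to-right scan
    # replaces the call to sorted.)
    for b in sorted(parent):
        p = parent[b]
        label[b] = b if p == b else label[p]
    return label


print(component_labels({0: 0, 2: 2, 4: 2, 6: 4, 8: 4, 18: 0}))
# {0: 0, 2: 2, 4: 2, 6: 2, 8: 2, 18: 0}
```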

Lemma 1. At iteration i of Step (3) of the algorithm, if the tree rooted at top
is active and i lies on cycle f and we have f.B < top.B, then there exists h in
the tree rooted at top such that h overlaps with f.

Proof. Since top is active, it must have been pushed onto the stack before
the current iteration (top.B < i) and we must not have reached the end
of top's extent (i < top.E). Hence, i must be contained in top's extent
(top.B < i < top.E). Since i lies on the cycle f that begins before top
(f.B < top.B), there must be an edge from cycle f that overlaps with top.

Input: permutation
Output: parent[i], the parent of i in the overlap forest
Begin
  1. scan the permutation, label each position i with C[i], and set up [C[i].B, C[i].E]
  2. initialize empty stack
  3. for i ← 0 to 2n + 1
     a) if i = C[i].B
          then push C[i]
     b) extent ← C[i]
        while (top.B > C[i].B)
          extent.B ← min{extent.B, top.B}
          extent.E ← max{extent.E, top.E}
          parent[top.B] ← C[i].B
          pop top
        endwhile
        top.B ← min{extent.B, top.B}
        top.E ← max{extent.E, top.E}
     c) if i = top.E
          then pop top
  4. convert each tree into a labeling of its vertices
End



Fig. 2. Constructing the Interleaving Forest in Linear Time 
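
As a hedged illustration of Step 3 of the figure, the following Python sketch builds the parent pointers from precomputed cycle labels. The function name `overlap_forest` and the input format are assumptions: `begin[p]` is the first position on the cycle through position p, and `extent[b]` is the pair (B, E) of the cycle that begins at b. The parent of a popped entry is recorded from the entry itself, so the order of the pop and the assignment does not matter here. The data in the example is the permutation of Fig. 1.

```python
def overlap_forest(begin, extent):
    """Return parent pointers of the overlap forest (roots point to themselves).

    begin and extent use the hypothetical format described above; each
    cycle is identified by its beginning position.
    """
    parent = {b: b for b in extent}
    stack = []                                  # entries are [B, E] lists
    for i, b in enumerate(begin):
        if i == b:                              # step 3a: a new cycle starts
            stack.append(list(extent[b]))
        lo, hi = extent[b]                      # step 3b
        while stack[-1][0] > b:
            top = stack.pop()
            lo, hi = min(lo, top[0]), max(hi, top[1])
            parent[top[0]] = b                  # popped tree hangs below b
        stack[-1][0] = min(lo, stack[-1][0])
        stack[-1][1] = max(hi, stack[-1][1])
        if i == stack[-1][1]:                   # step 3c: top is no longer active
            stack.pop()
    return parent


begin = [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 4, 4, 2, 2, 6, 6, 8, 8, 18, 18, 0, 0, 18, 18]
extent = {0: (0, 21), 2: (2, 13), 4: (4, 11), 6: (6, 15), 8: (8, 17), 18: (18, 23)}
print(overlap_forest(begin, extent))
# {0: 0, 2: 2, 4: 2, 6: 4, 8: 4, 18: 0}
```

The two resulting trees, rooted at the cycles beginning at positions 0 and 2, group the six cycles of the example into the components {A, F} and the remaining four cycles.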



Theorem 1. The algorithm produces a forest in which each tree is composed 
of exactly those nodes that form a connected component. 

Proof. It suffices to show that, after each iteration, the trees in the forest cor- 
respond exactly to the connected components determined by the permutation 
values scanned up to that point. We prove this invariant by induction on the 
number of applications of Step (3) of the algorithm. 

The base case is trivial: each tree of $F_0$ has a single node and no two nodes
belong to the same connected component since we have not yet processed any
element of the permutation.

Assume that the invariant holds after the (i - 1)st iteration and let i lie on
cycle f. We prove that the nodes of the tree containing i form the same set as
the nodes of the connected component containing i; other trees and connected
components are unaffected and so still obey the invariant.

- We prove that a node in the tree containing i must be in the same connected
  component as i.
  If we have i = f.B, then, as we remarked earlier, nothing changes in the
  overlap graph (and thus in the connected components); from Step (3), it is also
  clear that the forest remains unchanged, so that the invariant is preserved.
  On the other hand, if we have i > f.B, then at Step (3) the edge (top, f)
  will be added to the forest whenever f.B < top.B holds. This edge will join
  the subtree rooted at f with that rooted at top into a single subtree. From
  Lemma 1, we also know that, whenever f.B < top.B holds, there must
  exist h in the tree rooted at top such that h and f overlap, so that edge
  (h, f) must belong to the overlap graph, thereby connecting the component
  containing f with that containing top and merging them into a single
  connected component, which maintains the invariant.

- We prove that a node in the same connected component as i must be in the
  tree containing i. Whenever (j, i) and (k, l), with j < k < i < l, are gray
  edges on cycles f and h respectively, then edge (f, h) must belong to the
  overlap graph built from the first i entries of the permutation. In such a
  case, our algorithm ensures that edge (h, f) belongs to the overlap forest. Our
  conclusion follows.

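Each cycle is pushed onto the stack once and popped at most once, and all other
work in Step (3) is constant per position, so each step of the algorithm takes
linear time and the entire algorithm runs in worst-case linear time.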

6 Experiments 

Programs. The implementation due to Hannenhalli is very slow and implements 
the original method of Hannenhalli and Pevzner and not the faster one of Berman 
and Hannenhalli. The KST applet is very slow as well since it explicitly con- 
structs the overlap graph; it is also written in Java which makes it difficult to 
compare with C code. For these reasons we wrote our own implementation of 
the Berman and Hannenhalli algorithm (just the part handling the distance 
computation) with a view to efficiency. Thus, we not only have an efficient im- 
plementation to compare to our linear-time algorithm, but also we have ensured 
that the two implementations are truly comparable because they share much of 
their code (hurdles, fortresses, breakpoints), were written by the same person, 
and used the same algorithmic engineering techniques. 

Experimental Setup. We ran experiments on signed permutations of length 10, 
20, 40, 80, 160, 320, and 640, in order to verify rate of growth as a function of the 
number of genes and also to cover the full range of biological applications. We 
generated groups of 3 signed permutations from the identity permutation using 
the evolutionary model of Nadeau and Taylor; in this model, randomly
chosen inversions are applied to the permutation at a node to generate the 
permutations labelling its children, repeating the process until all nodes have 
been assigned a permutation. The expected number of inversions per edge, r, 
is fixed in advance, reflecting assumptions about the evolutionary rate in the 
model. We use 5 evolutionary rates: 4, 16, 64, 256, and 1024 inversions per edge 
and generated 10 groups of 3-leaf trees — or 10 groups of 3 genomes each — at each 
of the 6 selected lengths. We also generated 10 groups of 3 random permutations 
(from a uniform distribution) at each length to provide an extreme test case. For 
each of these 36 test suites, we computed the 3 distances among the 3 genomes 
in each group 20,000 times in a tight loop, in order to provide accurate timing 
values for a single computation, then averaged the values over the 10 groups 
and computed the standard deviation. The computed inversion distances are
expected to be at most twice the evolutionary rate since there are two tree edges
between each pair of genomes. Our linear algorithm exhibited very consistent
behavior throughout, with standard deviations never exceeding 2% of the mean;
the UF algorithm showed more variation for r = 4 and r = 16.

Fig. 3. The inversion distance as a function of the size of the signed permutation.
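
The generation of one test group can be sketched as follows. This is a simplified illustration, not the generator used in these experiments: it applies exactly r uniformly random inversions on each of the three edges leaving the identity permutation, rather than drawing the number of inversions from a distribution with mean r, and the names `evolve`, `three_leaf_group`, and the `seed` parameter are assumptions.

```python
import random


def evolve(perm, r, rng):
    """Apply r uniformly random inversions to a signed permutation (list of ints)."""
    perm = list(perm)
    n = len(perm)
    for _ in range(r):
        # Two distinct cut points 0 <= i < j <= n delimit the inverted segment.
        i, j = sorted(rng.sample(range(n + 1), 2))
        perm[i:j] = [-g for g in reversed(perm[i:j])]
    return perm


def three_leaf_group(n, r, seed=0):
    """Generate three genomes, each r inversions away from the identity."""
    rng = random.Random(seed)
    identity = list(range(1, n + 1))
    return [evolve(identity, r, rng) for _ in range(3)]


for genome in three_leaf_group(10, 4):
    print(genome)
```

Since every pair of leaves in such a group is separated by two edges, no pair can be more than 2r inversions apart, which matches the bound stated above.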

We ran all our tests on a 300MHz Pentium II with 16KB of L1 data
cache, 16KB of L1 instruction cache, and 512KB of L2 cache running at half
clock speed; our codes were compiled under Linux with the GNU gcc compiler
with options -O3 -mpentiumpro. Our code also runs on other systems and
machines (e.g., Solaris and Microsoft), where we observed the same behavior.

Experimental Results. We present our results in four plots. The first two (Figure 4)
show the actual running time of our linear-time algorithm for the computation of 
the inversion distance between two permutations as a function of the size of the 
permutation, with one plot for the computation of the connected components 
alone and the other for the complete distance computation. Each plot shows one 
curve each for the various evolutionary rates and one for the random permuta- 
tions. We added a third plot showing the average inversion distance; note the 
very close correlation between the distance and the running time. 

For small permutation sizes (10 or less), the L1 data cache holds all of the
data without any cache misses, but, as the permutation size grows, the hit rate
in the direct-mapped L1 cache steadily decreases until, for permutations of size
100 and larger, execution has slowed down to the speed of the L2 cache (a
ratio of 2). From that point on, it is clear that the rate of growth is linear, as
predicted. It is also clear that r = 1024 is as high a rate of evolution as we need
to test, since the number of connected components and inversion distance are
nearly indistinguishable from those of the random permutations (see the plot
in Figure 3, which shows inversion distance as a function of the permutation size).
The speed is remarkable: for a typical genome of 100 gene fragments (as found
in chloroplast data, for instance P|), well over 20,000 distance computations
can be carried out every second on our rather slow workstation.

Fig. 4. The running time of our linear-time algorithms as a function of the size of the

Our second two plots (Figure 5) compare the speed of our linear-time
algorithm and that of the UF approach. We plot speedup ratios, i.e., the ratio 
of the running time of the UF approach to that of our linear-time algorithm. 
Again, the first plot addresses only the connected components part of the 
computation, while the second captures the complete distance computation. 

Since the two approaches use a different amount of memory and give rise to a
very different pattern of addressing (and thus different cache conflicts), the ratios
up to permutations of size 100 vary quite a bit as the size increases, reflecting
a transition from the speed of the L1 cache to that of the L2 cache at different
permutation sizes for the two algorithms. Beyond that point, however, the ratios
stabilize and clearly demonstrate the gain of our algorithm, a gain that increases
with increasing permutation size as well as with increasing evolutionary rate.



7 Concluding Remarks 

We have presented a new, very simple, practical, linear-time algorithm for
computing the inversion distance between two signed permutations, along with
a detailed experimental study comparing the running time of our algorithm
with that of Berman and Hannenhalli. Our code is available from the web page
www.cs.unm.edu/~moret/GRAPPA under the terms of the GNU Public License
(GPL); it has been tested under Linux (including the parallel version), FreeBSD,
Solaris, and Windows NT. This code includes inversion distance as part of a
much larger context, which provides means of reconstructing phylogenies based
on gene order data. We found that using our inversion distance computation in
lieu of the surrogate breakpoint distance (which was used by previous researchers
in an attempt to speed up computation EE3) only slowed down the
reconstruction algorithm by about 30%, enabling us to extend work on breakpoint
analysis (as reported in [t)ll 5) 1 to similar work on inversion phylogeny.

Abstract. Given a set of species and their similarity data, an important
problem in evolutionary biology is how to reconstruct a phylogeny (also
called evolutionary tree) so that species are close in the phylogeny if and
only if they have high similarity. Assume that the similarity data are
represented as a graph G = (V, E) where each vertex represents a species
and two vertices are adjacent if they represent species of high similarity.
The phylogeny reconstruction problem can then be abstracted as the
problem of finding a (phylogenetic) tree T from the given graph G such
that (1) T has no degree-2 internal nodes, (2) the external nodes (i.e.,
leaves) of T are exactly the elements of V, and (3) $(u, v) \in E$ if and only
if $d_T(u, v) \le k$ for some fixed threshold k, where $d_T(u, v)$ denotes the
distance between u and v in tree T. This is called the Phylogenetic
kth Root Problem (PR$_k$), and such a tree T, if it exists, is called a
phylogenetic kth root of graph G. The computational complexity of PR$_k$
is open, except for $k \le 4$. In this paper, we investigate PR$_k$ under a
natural restriction that the maximum degree of the phylogenetic root is
bounded from above by a constant. Our main contribution is a linear-time
algorithm that determines if G has such a phylogenetic kth root, and
if so, demonstrates one. On the other hand, as in practice the collected
similarity data are usually not perfect and may contain errors, we propose
to study a generalized version of PR$_k$ where the output phylogeny is only
required to be an approximate root of the input graph. We show that this
and other related problems are computationally intractable.

1 Introduction

The reconstruction of evolutionary history for a set of species from quantitative 
biological data has long been a popular problem in computational biology. This 
evolutionary history is typically modeled by an evolutionary tree or phylogeny. 
A phylogeny is a tree where the leaves are labeled by species and each internal 
node represents a speciation event whereby an ancestral species gives rise to 
two or more child species. Both rooted and unrooted trees have been used to
describe phylogenies in the literature, although they are practically equivalent.
In this paper, we will consider only unrooted phylogenies for the convenience of
presentation.¹ The internal nodes of a phylogeny have degrees (in the sense of
unrooted trees, i.e. the number of incident edges) at least 3. Proximity within a 
phylogeny in general corresponds to similarity in evolutionary characteristics. 

Many phylogenetic reconstruction algorithms have been proposed and studied
in the literature P2|. In this paper we investigate the computational feasibility
of a graph-theoretic approach for reconstructing phylogenies from similarity
data. Specifically, interspecies similarity is represented by a graph where the
vertices are the species and the adjacency relation represents evidence of
evolutionary similarity. A phylogeny is then reconstructed from the graph such that
the leaves of the phylogeny are labeled by vertices of the graph (i.e., species) and
any two vertices of the graph are adjacent in the graph if and only if
their corresponding leaves in the phylogeny are connected by a path of length
at most k, where k is a predetermined proximity threshold. To avoid confusion,
we call the vertices of the graph vertices and those of the phylogeny nodes.
Recall that the length of the (unique) path connecting two nodes u and v in
phylogeny T is the number of edges on the path, which is denoted by $d_T(u, v)$.
This approach gives rise to the following algorithmic problem |^:

Phylogenetic kth Root Problem (PRk): 

Given a graph G = (V, E), find a phylogeny T with leaves labeled by the 
elements of V such that for each pair of vertices u, v ∈ V, (u, v) ∈ E if and 
only if d_T(u, v) ≤ k. 

Such a phylogeny T (if it exists) is called a phylogenetic kth root, or a kth root 
phylogeny, of graph G. Graph G is called the kth phylogenetic power of T. For 
convenience, we denote the kth phylogenetic power of any phylogeny T as T^k. 
Thus, PRk asks for a phylogeny T such that G = T^k. 
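
The definition of the kth phylogenetic power can be made concrete with a short sketch. The Python below is illustrative only: the adjacency-dictionary representation, the function names tree_distances_from and phylogenetic_power, and the example tree are assumptions made for this sketch. It computes the power of a small tree by breadth-first search from every leaf, so that checking whether a given graph equals the power reduces to comparing edge sets.

```python
from collections import deque
from itertools import combinations


def tree_distances_from(adj, source):
    """Breadth-first search: number of edges from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def phylogenetic_power(adj, k):
    """Edge set of the kth phylogenetic power: leaf pairs at distance <= k."""
    leaves = [v for v, nbrs in adj.items() if len(nbrs) == 1]
    dist = {leaf: tree_distances_from(adj, leaf) for leaf in leaves}
    return {frozenset((u, v)) for u, v in combinations(leaves, 2)
            if dist[u][v] <= k}


# Hypothetical phylogeny: internal nodes x and y, leaves a, b, c, d.
T = {
    "x": ["a", "b", "y"],
    "y": ["c", "d", "x"],
    "a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"],
}
# Hypothetical input graph on the same four species.
G_edges = {frozenset(e) for e in [("a", "b"), ("c", "d")]}
print(phylogenetic_power(T, 2) == G_edges)
```

In this example the leaves a and b (and likewise c and d) are at distance 2, while leaves on opposite sides of the edge (x, y) are at distance 3, so the tree is a phylogenetic 2nd root of the graph with edges (a, b) and (c, d).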

1.1 Connection to Graph and Tree Roots, and Previous Results 

Phylogenetic power might be thought of as a Steiner extension of the standard 
notion of graph power. A graph G is the kth power of a graph H (or equivalently, 
H is a kth root of G) if vertices u and v are adjacent in G if and only if the 
length of the shortest path from u to v in H is at most k. An important special 
case of graph power/root problems is the following: 

¹ But some of our hardness proofs will also use rooted trees as intermediate data 
structures in the construction. 

Tree kth Root Problem (TRk): 

Given a graph G = (V, E), find a tree T = (V, E_T) such that (u, v) ∈ E if 
and only if d_T(u, v) ≤ k. 

If T exists then it is called a tree kth root, or a kth root tree, of graph G. 

The special case TR2 is also known as the Tree Square Root Problem 
|E]. Correspondingly, we call PR2 the Phylogenetic Square Root Problem. 
There is a rich literature on graph roots and powers (see | 2 | Section 10.6] for an 
overview), but few results on phylogenetic/tree roots/powers. It is NP-complete 
to recognize a graph power m; nonetheless, it is possible to determine if a graph 
has a kth root tree, for any fixed k, in O(n^3) time, where n is the number of 
vertices in the input graph jSj. In particular, determining if a graph has a tree 
square root can be done in O(n + e) time P), where e is the number of edges 
in the input graph. Recently, Nishimura, Ragde, and Thilikos HH presented an 
O(n^3)-time algorithm for a variant of PRk, for k ≤ 4, where internal nodes of 
the output phylogeny are allowed to have degree 2. More recently, Lin, Kearney, 
and Jiang 0 introduced a novel notion of critical clique and obtained an 
O(n + e)-time algorithm for PRk, for k ≤ 4. Unfortunately, neither algorithm 
can be generalized to k ≥ 5. 



1.2 Our Contribution 

In the practice of phylogeny reconstruction, most phylogenies considered are 
trees of degree 3 because speciation events are usually bifurcating events in 
the evolutionary process. In such fully resolved phylogenetic trees, each internal 
node has three neighbors and represents a speciation event in which some ancestral 
species splits into two child species. Nodes of degrees higher than 3 are introduced 
only when the input biological (similarity) data is not sufficient to separate 
individual speciation events and hence several such events may be collapsed into 
a non-bifurcating (super) speciation event in the reconstructed phylogeny. Hence 
in this paper, we consider a restricted version of PRk where the output phylogeny 
is assumed to have degree at most Δ, for some fixed constant Δ ≥ 3. For 
simplicity, we call it the Degree-Δ PRk and denote it in short as ΔPRk. Since 
in the practice of computational biology the species under consideration 
are more or less related, we are mostly interested in connected graphs. The main 
contribution of this paper is a linear-time algorithm that determines, for any 
input connected graph G and constant Δ ≥ 3, if G has a kth root phylogeny 
with degree at most Δ, and if so, demonstrates one such phylogeny. The basic 
construction in our algorithm is a nontrivial application of bounded-width 
tree-decomposition of certain chordal graphs |2| . 

Notice that the input graph in PRk is derived from some similarity data, 
which is usually inexact in practice and may have erroneous (spurious or missing) 
edges. Such errors may result in graphs that have no phylogenetic roots. Hence, 
it is natural to consider a more relaxed problem where we look for phylogenetic 
trees whose powers are close to the input graphs. The precise formulation is as 
follows: 

Closest Phylogenetic kth Root Problem (CPRk): 

Given a graph G = (V, E) and a nonnegative integer ℓ, find a phylogeny 
T with leaves labeled by V such that G and T^k differ by at most ℓ edges. 
That is, 

|E(G) ⊕ E(T^k)| = |(E(G) − E(T^k)) ∪ (E(T^k) − E(G))| ≤ ℓ. 

A phylogeny T which minimizes the above edge discrepancy is called a closest 
kth root phylogeny of graph G. 
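
The edge discrepancy is simply the size of a symmetric difference of two edge sets, which the following minimal sketch spells out. The function names edge_discrepancy and within_budget and the sample edge lists are hypothetical, chosen only for illustration; edges are treated as unordered pairs.

```python
def edge_discrepancy(graph_edges, power_edges):
    """Number of edges in exactly one of the two undirected edge sets."""
    g = {frozenset(e) for e in graph_edges}
    p = {frozenset(e) for e in power_edges}
    return len((g - p) | (p - g))


def within_budget(graph_edges, power_edges, budget):
    """True if the two edge sets differ by at most `budget` edges."""
    return edge_discrepancy(graph_edges, power_edges) <= budget


# Hypothetical data: the graph has a spurious edge (b, c) and misses (c, d).
E_G = [("a", "b"), ("b", "c")]
E_Tk = [("b", "a"), ("c", "d")]
print(edge_discrepancy(E_G, E_Tk))  # 2
```

Here the pair (b, c) appears only in the input graph and the pair (c, d) only in the power, so the discrepancy is 2; the pair (a, b) matches regardless of the order in which its endpoints are written.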

The Closest Tree kth Root Problem (CTRk) is defined analogously. 
Notice that CTR1 is trivially solved by finding a spanning tree of the input 
graph. Kearney and Corneil |S| proved that CTRk is NP-complete when k ≥ 3. 
The computational complexity of CTR2 had been open for a while and was 
recently shown to be intractable by Jiang, Lin, and Xu. In this paper, we 
will show that CPRk is NP-complete, for any k ≥ 2. Another closely related 
problem, the Steiner kth Root Problem (where k ≥ 1), is also studied. 

We introduce some notations and definitions, as well as some existing related 
results, in the next section. Our main result on bounded-degree PRk is presented 
in Section 3. The hardness of closest phylogenetic root and Steiner root problems 
is discussed in Section 4. We close the paper with some remarks in Section 5. 

Due to the page limit, we will omit all proofs and most of the details of the 
constructions for the hardness results in this extended abstract. The proofs and 
detailed constructions can be found in the full version |2|. 

2 Preliminaries 

We employ standard terminologies in graph theory. In particular, the subgraph 
of a graph G induced by a vertex set U of G is denoted by G[U], the degree of a 
vertex v in G is denoted by deg_G(v), and the maximum size of a clique in G is 
denoted by ω(G). First, it is obvious that if a graph has a kth root phylogeny, 
then it must be chordal, that is, it contains no induced subgraph which is a cycle 
of size greater than 3. 
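
Chordality can be tested in several ways. The sketch below uses the standard characterization that a graph is chordal if and only if its vertices can be eliminated one at a time, each being simplicial (its remaining neighbors form a clique) when removed. This is a naive polynomial-time check written for clarity, not a linear-time recognition algorithm; the function name is_chordal and the example graphs are assumptions of this sketch.

```python
from itertools import combinations


def is_chordal(adj):
    """Return True if the graph admits a perfect elimination ordering.

    `adj` maps each vertex to the set of its neighbors.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    while remaining:
        simplicial = None
        for v, nbrs in remaining.items():
            # v is simplicial if every two of its neighbors are adjacent.
            if all(b in remaining[a] for a, b in combinations(nbrs, 2)):
                simplicial = v
                break
        if simplicial is None:
            return False  # no simplicial vertex: some induced cycle has length > 3
        for u in remaining.pop(simplicial):
            remaining[u].discard(simplicial)
    return True


# A 4-cycle a-b-c-d-a is not chordal; adding the chord (a, c) makes it chordal.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
chorded = {"a": {"b", "c", "d"}, "b": {"a", "c"},
           "c": {"a", "b", "d"}, "d": {"a", "c"}}
print(is_chordal(cycle), is_chordal(chorded))
```

Greedy elimination is safe here because every induced subgraph of a chordal graph is chordal and therefore still contains a simplicial vertex.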

Definition 1. A tree-decomposition of a graph G = (V, E) is a pair D = (T, B) 
consisting of a tree T = (U, F) and a collection B = {B_α | B_α ⊆ V, α ∈ U} of 
sets (called bags) for which 

— ∪_{α ∈ U} B_α = V, 

— for each edge (v1, v2) ∈ E, there is a node α ∈ U such that {v1, v2} ⊆ B_α, 
and 

— if α2 ∈ U is on the path connecting α1 and α3 in T, then B_α1 ∩ B_α3 ⊆ B_α2. 

The treewidth associated with this tree-decomposition D = (T, B) is 
tw(G, D) = max_{α ∈ U} |B_α| − 1. 

The treewidth of graph G, denoted by tw(G), is the minimum tw(G, D) taken 
over all tree-decompositions D of G. 

A clique-tree-decomposition of G is a tree-decomposition (T, B) of G such 
that each bag in B is a maximal clique of G. 
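
The three conditions of a tree-decomposition can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: the names is_tree_decomposition and decomposition_width are hypothetical, the tree is given as an adjacency dictionary that is assumed (not verified) to be a tree, and the third condition is tested in its equivalent form that, for every vertex, the nodes whose bags contain it induce a connected subtree.

```python
from collections import deque


def is_tree_decomposition(vertices, edges, tree_adj, bags):
    """Check the three tree-decomposition conditions for the given bags."""
    # Condition 1: the bags cover the vertex set exactly.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every edge lies inside some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the nodes holding a vertex form a connected subtree.
    for x in vertices:
        holders = {a for a, bag in bags.items() if x in bag}
        start = next(iter(holders))
        seen = {start}
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in tree_adj[a]:
                if b in holders and b not in seen:
                    seen.add(b)
                    queue.append(b)
        if seen != holders:
            return False
    return True


def decomposition_width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# Hypothetical example: the path a-b-c with two bags that are maximal cliques.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
tree_adj = {1: [2], 2: [1]}
bags = {1: {"a", "b"}, 2: {"b", "c"}}
print(is_tree_decomposition(vertices, edges, tree_adj, bags),
      decomposition_width(bags))
```

In the example each bag is a maximal clique of the path, so the decomposition is also a clique-tree-decomposition of width 1.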

Lemma 1. jS| Every chordal graph has a clique-tree-decomposition. 

From the proof of Lemma 1 given in 0, it is not difficult to see that a 
clique-tree-decomposition D = (T, B) of a given chordal graph G can be computed in 
linear time if ω(G) = O(1). We can further modify D so that deg_T(α) ≤ 3 for 
each node α of T. This modification takes linear time too if ω(G) = O(1). 

Hereafter, a tree-decomposition of a chordal graph G always means a 
clique-tree-decomposition D = (T, B) of G such that deg_T(α) ≤ 3 for all nodes α of T. 
Furthermore, in the sequel, we abuse the notation and use D to denote the tree 
T in it (since we will use T to denote the kth root phylogeny of graph G), and 
denote the bag associated with a node α of D by B_α. 

3 Algorithm for Bounded-Degree PRk 

This section presents a linear-time algorithm for solving 3PRk. The adaptation 
to ΔPRk where Δ ≥ 4 is straightforward and is hence omitted here. 

We assume that the input graph G = (V, E) is not complete but is chordal; 
otherwise the problem is trivially solved in linear time. Since every vertex v ∈ V 
appears as an external node (i.e. leaf) in the kth root phylogeny, the maximum 
size ω(G) of a clique in G can be bounded from above by a constant f(k), where 



So, we can construct a clique-tree-decomposition D of G in linear time. The basic 
idea behind our algorithm is to do a dynamic programming on a rooted version 
of the decomposition D. The dynamic programming starts at the leaves of D 
and proceeds upwards. After processing the root, the algorithm will construct a 
kth root phylogeny of G if there is any. The processing of a node α of D can be 
sketched as follows. Let U_α be the union of the bags associated with α and its 
descendants in D. While processing α, the algorithm computes a set of trees T 
such that (1) T may possibly be a subtree of a kth root phylogeny of G, (2) all 
vertices of U_α are leaves of T, and (3) each leaf of T not contained in U_α is not 
labeled. The unlabeled leaves of T serve as ports from which we can expand T 
so that it may eventually become a kth root phylogeny of G. The crucial point 
we will observe is that we only need those ports that are at distance at most k 
apart from vertices of B_α in T. This point implies that the number of necessary 
ports only depends on k and hence is a constant. 
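
To illustrate why the clique size is bounded by a function of k alone when every internal node has degree 3, the sketch below builds, for a given k, a tree whose leaves are pairwise within distance k and counts them. Everything here is an assumption of the sketch rather than something stated in the text: the names max_close_leaves, extremal_tree and leaf_stats, and in particular the closed form, which is obtained by packing leaves into a ball centred at a node (even k) or at an edge (odd k). The code only confirms that the construction attains this count; it does not prove that no larger set of mutually close leaves exists.

```python
from collections import deque


def max_close_leaves(k):
    """Closed form assumed by this sketch for internal degree exactly 3."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if k % 2 == 0:
        return 3 * 2 ** (k // 2 - 1)   # ball centred at a node
    return 2 ** ((k + 1) // 2)         # ball centred at an edge


def extremal_tree(k):
    """Adjacency dictionary of a tree attaining the count above."""
    adj = {}

    def new_node():
        node = len(adj)
        adj[node] = []
        return node

    def grow(parent, depth):
        """Hang a binary subtree whose leaves are `depth` edges below parent."""
        node = new_node()
        adj[parent].append(node)
        adj[node].append(parent)
        if depth > 1:
            grow(node, depth - 1)
            grow(node, depth - 1)

    if k % 2 == 0:
        centre = new_node()
        for _ in range(3):
            grow(centre, k // 2)
    else:
        left, right = new_node(), new_node()
        adj[left].append(right)
        adj[right].append(left)
        for centre in (left, right):
            for _ in range(2):
                grow(centre, (k - 1) // 2)
    return adj


def leaf_stats(adj):
    """Return (number of leaves, largest leaf-to-leaf distance)."""
    leaves = [v for v, nbrs in adj.items() if len(nbrs) == 1]
    best = 0
    for source in leaves:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        best = max(best, max(dist[leaf] for leaf in leaves))
    return len(leaves), best


for k in range(2, 7):
    print(k, max_close_leaves(k), leaf_stats(extremal_tree(k)))
```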

One more notation is in order. For two adjacent nodes α and β of D, let 
U(α, β) = ∪_γ B_γ, where γ ranges over all nodes of D with d_D(γ, α) < d_D(γ, β). 
In other words, if we root D at node β, then U(α, β) is the union of the bags 
associated with α and its descendants in D. A useful property of D is that for 
every internal node β and every two neighbors α1 and α2 of β in D, G has no 
edge between a vertex of U(α1, β) − B_β and a vertex of U(α2, β) − B_β. 
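
The set U(α, β) is the union of the bags on the α side of the decomposition edge between α and β, which a simple traversal computes. The sketch below is illustrative: the function name union_of_bags, the adjacency-dictionary encoding of the decomposition tree, and the example bags are assumptions made here.

```python
def union_of_bags(tree_adj, bags, alpha, beta):
    """Union of the bags of all nodes strictly closer to alpha than to beta."""
    if beta not in tree_adj[alpha]:
        raise ValueError("alpha and beta must be adjacent in the tree")
    result = set()
    stack = [(alpha, beta)]  # (node, node we came from)
    while stack:
        node, parent = stack.pop()
        result |= bags[node]
        for nxt in tree_adj[node]:
            if nxt != parent:
                stack.append((nxt, node))
    return result


# Hypothetical decomposition of the path a-b-c-d: tree 1 - 2 - 3.
tree_adj = {1: [2], 2: [1, 3], 3: [2]}
bags = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
print(sorted(union_of_bags(tree_adj, bags, 1, 2)))
print(sorted(union_of_bags(tree_adj, bags, 3, 2)))
```

In the example, removing the bag of node 2 from the two printed sets leaves the single vertices a and d, which are indeed non-adjacent in the path, in line with the separation property stated above.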
3.1 Ideas behind the Dynamic Programming Algorithm 

Note that since Δ = 3, every internal node in a kth root phylogeny T of G has 
degree exactly 3. 

Definition 2. Let U be a set of vertices of G. A relaxed phylogeny for U is a 
tree R satisfying the following conditions: 

— The degree of each internal node in R is 3. 

— Each vertex of U is a leaf in R and appears in R only once. For convenience, 
we call the leaves of R that are also vertices of U final leaves of R, and call 
the remaining leaves of R temporary leaves of R. 

— For every two vertices u and v of U, u and v are adjacent in G if and only 
if d_R(u, v) ≤ k. 

— Each temporary leaf v of R is assigned a pair (γ, t), where γ is a node of D 
and 0 ≤ t ≤ k. We call γ the color of v and call t the threshold of v. For 
convenience, we denote the color of a temporary leaf v of R by c_R(v), and 
denote the threshold of v by t_R(v). 

Intuitively speaking, the temporary leaves of R serve as ports from which we 
can expand R so that it may eventually become a kth root phylogeny of G. 

Recall that our algorithm processes the nodes of D one by one. While 
processing a node α of D, the algorithm finds all relaxed phylogenies for B_α 
that are subtrees of kth root phylogenies of graph G. The following lemma shows 
that such relaxed phylogenies for B_α have certain useful properties. 

Lemma 2. Let T be a kth root phylogeny of G. Let α be a node of D. Root T 
at an arbitrary leaf that is in B_α. Define a pure node to be a node w of T such 
that α has a neighbor γ in D such that all leaf descendants of w in T are in 
U(γ, α) − B_α. Define a critical node to be a pure node of T whose parent is not 
pure. Let R be the relaxed phylogeny for B_α obtained from T by performing the 
following steps: 

1. For every critical node w of T, perform the following: 

a) Compute the minimum distance from w to a leaf descendant of w in T; let 
ℓ_w denote this distance. (Comment: ℓ_w ≤ k or else the leaf descendants 
of w in tree T would be unreachable from the outside in graph G.) 

b) Find the neighbor γ of α such that all leaf descendants of w in T are 
contained in U(γ, α). 

c) Delete all descendants (excluding w, of course) of w, and assign the pair 
(γ, ℓ_w) to w. 

2. Unroot T. 

Then, R has the following properties: 

— For every temporary leaf v of R, c_R(v) is a neighbor of α in D. 

— For every two temporary leaves u and v of R with different colors, it holds 
that t_R(u) + t_R(v) + d_R(u, v) > k. 

— For every neighbor γ of α in D, every temporary leaf v of R with c_R(v) = γ, 
and every final leaf w of R with w ∉ B_γ, it holds that d_R(v, w) + t_R(v) > k. 

— For every internal node v of R, either at least one descendant of v is a final 
leaf of R, or there is a final leaf u of R with d_R(u, v) ≤ k − 1. 

Each relaxed phylogeny R for B_α having the four properties in Lemma 2 
is called a skeleton of α. The following lemma shows that there can be only a 
constant number of skeletons of α. 

Lemma 3. For each node α of D, the number of skeletons of α is bounded from 
above by a constant depending only on k and |B_α|. 

By Lemma 3, while processing a node α of D, our algorithm can find 
all skeletons of α in constant time. For each skeleton S of α, if possible, the 
algorithm then extends S to a relaxed phylogeny for U(α, β), where β is the 
parent of α in rooted D. The algorithm records these relaxed phylogenies of α 
in the dynamic programming table for later use when processing the parent β. 
The following definition aims at removing unnecessary relaxed phylogenies of α 
from the dynamic programming table. 

Definition 3. Let α and β be two adjacent nodes of D. Let S be a skeleton of 
α. The projection of S to β is a relaxed phylogeny for B_α ∩ B_β obtained from S 
by performing the following steps: 

1. Change each final leaf v ∉ B_β to a temporary leaf; set the threshold of v to 
be 0 and set the color of v to be α. 

2. Root S at an arbitrary vertex of B_α ∩ B_β. 

3. Find those nodes v in S such that (i) every leaf descendant of v in S is a 
temporary leaf whose color is not β, but (ii) the parent of v in S does not 
have Property (i). 

4. For each node v found in the last step, if v is a leaf in S then set the color 
of v to be α; otherwise, perform the following steps: 

a) Set m_v = min_u {t_S(u) + d_S(u, v)}, where u ranges over all leaf descendants 
of v in S. 

b) Delete all descendants of v in S. 

c) Set v to be a temporary leaf, set the color of v to be α, and set the 
threshold of v to be m_v. 

5. Unroot S. 

Obviously, two different skeletons of α may have the same projection to β. 
For convenience, we say that these skeletons are equivalent. Among equivalent 
skeletons of α, our algorithm will extend only a hopeful one of them to a relaxed 
phylogeny for U(α, β) and record it in the dynamic programming table. This 
motivates the following definition: 

Definition 4. Let α and β be two adjacent nodes of D. A projection of α to β 
is the projection of a skeleton of α to β. Let P be a projection of α to β. An 
expansion of P to U(α, β) is a relaxed phylogeny X for U(α, β) such that some 
subtree Y of X is isomorphic to P, and the bijection f from the node set of P to 
the node set of Y witnessing this isomorphism satisfies the following conditions: 

— For every final leaf v of P, f(v) = v. 

— For every temporary leaf v of P with c_P(v) = β, f(v) is a temporary leaf of 
X with c_X(f(v)) = c_P(v) and t_X(f(v)) = t_P(v). 

— Suppose that we root X at a vertex in B_α ∩ B_β. Then, for every temporary 
leaf v of P with c_P(v) ≠ β (hence c_P(v) = α), all leaf descendants of f(v) 
in X are final leaves and are contained in U(α, β) − B_β, and the minimum 
distance between f(v) and a leaf descendant of f(v) in X equals t_P(v). 

Note that a projection of α to β may have no expansion to U(α, β). The 
following lemma shows that if G has a kth root phylogeny T, then some subtree 
of T is a projection of α to β and another subtree of T is its expansion to U(α, β). 

Lemma 4. Let α and β be two adjacent nodes in D. Let T be a kth root 
phylogeny of G. Root T at an arbitrary leaf that is in B_α. Let R be the skeleton of 
α obtained from T as in Lemma 2. Let P be the projection of R to β. Define a 
β-pure node to be a node w of T such that all leaf descendants of w in T are 
contained in U(β, α) − B_α. Further define a β-critical node to be a β-pure node 
of T whose parent is not β-pure. Let X be the relaxed phylogeny for U(α, β) 
obtained from T by performing the following steps: 

1. For every β-critical node w of T, perform the following: 

a) Compute the minimum distance from w to a leaf descendant of w in T; let 
ℓ_w denote this distance. (Comment: ℓ_w ≤ k or else the leaf descendants 
of w in tree T would be unreachable from the outside in graph G.) 

b) Delete all descendants (excluding w, of course) of w, and assign the pair 
(β, ℓ_w) to w. 

2. Unroot T. 

Then, X is an expansion of P to U(α, β). 

By Lemma 4, whenever G has a kth root phylogeny, there is always a 
projection of α to β that has an expansion to U(α, β). While processing α, our 
algorithm will find those projections that have expansions to U(α, β), and 
record the expansions in the dynamic programming table. 



3.2 Details of Dynamic Programming for 3PRk 

To solve the 3PRk problem for G, we perform a dynamic programming on the 
tree-decomposition D as follows. To simplify the description of the algorithm, 
we add a new node r to D, connect r to an arbitrary leaf α of D, and copy the 
bag at α to r (that is, B_r = B_α). Clearly, the resultant D is still a required 
tree-decomposition of G. Root D at r. 

The dynamic programming starts at the leaves of D, and proceeds upwards; 
after the unique child of the root r of D is processed, we will know whether G 
has a kth root phylogeny or not. The invariant maintained during the dynamic 
programming is that after each non-root node α has been processed, for each 
projection P of α to its parent β, we will have found out whether P has an 
expansion X to U(α, β), and will have found and recorded such an X if any. 

Now consider how a non-root node α of D is processed. Let β be the parent 
of α in D. First suppose that α is a leaf of D. When processing α, we find and 
record all possible projections of α to β; moreover, for each projection P found, 
we also record a skeleton S of α such that P is the projection of S to β. 

Next suppose that α is neither a leaf nor the root node of D, and suppose that 
all descendants of α in D have been processed. To process α, we try all possible 
skeletons S of α. When trying S, for each child γ of α in D, we first compute 
the projection P_γ of S to γ, and then check whether P_γ is also a projection of 
γ to α and additionally has an expansion to U(γ, α). If the checking fails for at 
least one child of α, we proceed to try the next possible skeleton of α. Otherwise, 
we can conclude that the projection P_β of S to β has an expansion to U(α, β) 
because such an expansion can be constructed as follows: 

1. For each child γ of α in D, search the dynamic programming table to find 
the expansion X_γ of P_γ to U(γ, α), and find the bijection f_γ (from the node 
set of P_γ to the node set of some subtree of X_γ) witnessing that X_γ is an 
expansion of P_γ to U(γ, α). 

(Comment: To speed up the algorithm, we may have recorded this bijection 
in the dynamic programming table when processing γ.) 

2. For each child γ of α in D, root X_γ at an arbitrary vertex of B_α ∩ B_γ. 

3. Modify S as follows: For each temporary leaf v of S with c_S(v) ≠ β, replace 
v by the subtree of X_γ rooted at f_γ(v), where γ = c_S(v). 

(Comment: Recall that by Definition 3, each temporary leaf v of S with 
c_S(v) = γ is also a temporary leaf of P_γ.) 

One can verify that the above construction indeed gives us an expansion of 
P_β. Since P_β is a possible projection of α to β, we record this expansion for 
P_β in the dynamic programming table. After trying all possible skeletons of α, 
if we find no projection of α to β that has an expansion to U(α, β), then we 
can conclude that G has no kth root phylogeny; otherwise, we proceed to the 
processing of the next node of D. 

Finally, suppose that α is the unique child of the root r of D. Further suppose 
that α has been successfully processed; otherwise we already know that G has 
no kth root phylogeny. Then, by searching the dynamic programming table, we 
try to find a projection P of α to r such that (i) P has no temporary leaf whose 
color is r, and (ii) an expansion X of P to U(α, r) has been recorded in the 
dynamic programming table. If P is found, we can conclude that X is a kth root 
phylogeny for G; otherwise, we can conclude that G has no kth root phylogeny. 
The above discussion justifies the following theorem: 

Theorem 1. Let k be a constant integer larger than or equal to 2. There is 
a linear-time algorithm determining if a given graph has a kth root phylogeny 
in which every internal node has degree 3, and if so, demonstrating one such 
phylogeny. 

We can easily generalize the above discussion to prove the following: 

Corollary 1. Let Δ and k be constant integers such that Δ ≥ 3 and k ≥ 2. 
There is a linear-time algorithm determining if a given graph has a kth root 
phylogeny in which every internal node has degree in the range [3, Δ], and if so, 
demonstrating one such phylogeny. 

4 The Hardness of Closest Phylogenetic Root Problems 

Due to the page limit, we will only introduce some basic concepts that are useful 
in proving our hardness results, and refer the reader to the full version |3| for a 
detailed discussion of the hardness results and their proofs. 

Consider a set S = {s_1, s_2, ..., s_n}. Let M be a symmetric matrix with rows 
and columns indexed by the elements of S. M is a binary dissimilarity matrix 
on set S if M(s_i, s_j) ∈ {1, 2} for every pair (s_i, s_j) of distinct elements of S and 
M(s_i, s_i) = 0 for every element s_i ∈ S. 

A tree T is a 2-ultrametric on set S if T is a rooted tree whose leaves are 
labeled by the elements of S and each leaf-to-root path contains exactly two 
edges. Call a node in T that is neither a leaf nor the root a middle node, to 
avoid ambiguity. The half-distance between two leaves s_i and s_j, denoted by 
h_T(s_i, s_j), is one half of the number of edges on the unique path in T connecting 
s_i and s_j. Clearly, h_T(s_i, s_j) ∈ {1, 2} if i ≠ j, and h_T(s_i, s_i) = 0 for every i. 
Given a binary dissimilarity matrix M and a 2-ultrametric T on set S, define 

D(T, M) = Σ_{i<j} |h_T(s_i, s_j) − M(s_i, s_j)|, 

which measures how well T matches the inter-leaf (half-)distances specified by 
M.² The following Fitting Ultrametric Trees (FUT) problem has been 
shown to be NP-complete by Křivánek and Morávek J^. 

Fitting Ultrametric Trees (FUT): 

Given a binary dissimilarity matrix M on set S and a nonnegative integer 
ℓ, decide if there is a 2-ultrametric T on S such that D(T, M) ≤ ℓ. 
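
A 2-ultrametric is determined by which middle node each leaf hangs from, so the fit measure can be evaluated directly from such an assignment. The sketch below is a minimal illustration under explicit assumptions: the measure is taken to be the sum over unordered pairs of the absolute difference between the half-distance and the matrix entry, the matrix is stored as a dictionary keyed by unordered pairs, and the names half_distance and fit_cost are hypothetical.

```python
from itertools import combinations


def half_distance(middle_of, a, b):
    """Half-distance in a 2-ultrametric given each leaf's middle node."""
    if a == b:
        return 0
    return 1 if middle_of[a] == middle_of[b] else 2


def fit_cost(middle_of, M):
    """Sum of |half-distance - M entry| over unordered pairs of leaves.

    The absolute-difference form is an assumption of this sketch.
    """
    return sum(abs(half_distance(middle_of, a, b) - M[frozenset((a, b))])
               for a, b in combinations(sorted(middle_of), 2))


# Hypothetical instance on three species.
M = {
    frozenset(("s1", "s2")): 1,
    frozenset(("s1", "s3")): 2,
    frozenset(("s2", "s3")): 1,
}
split = {"s1": "m1", "s2": "m1", "s3": "m2"}    # s3 under its own middle node
single = {"s1": "m1", "s2": "m1", "s3": "m1"}   # all leaves under one middle node
print(fit_cost(split, M), fit_cost(single, M))
```

Because all entries and half-distances between distinct leaves lie in {1, 2}, each term is 0 or 1, so under this assumption the measure simply counts the pairs on which the tree and the matrix disagree.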

Kearney and Corneil jSl proved that CTRk is NP-hard when k ≥ 3 by 
a (polynomial-time) reduction from FUT (to CTR3). Using a more dextrous 
reduction, Jiang, Lin, and Xu Pj have recently shown that CTR2 is intractable 
too. To prove the intractability of CPRk, we need one more definition. A critical 
clique |S| of a graph is a maximal subset of vertices that are adjacent to each 
other and have a common neighborhood. As for constructing a phylogenetic root, 
the vertices in a critical clique can be identified because they are interchangeable 
in every phylogenetic root. 
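
Critical cliques can be found by grouping vertices with identical closed neighborhoods (a vertex together with its neighbors): two distinct vertices with the same closed neighborhood are adjacent to each other and share all other neighbors. The sketch below illustrates this grouping; the function name critical_cliques and the example graph are assumptions of the sketch.

```python
from collections import defaultdict


def critical_cliques(adj):
    """Partition the vertices into classes with equal closed neighborhoods.

    `adj` maps each vertex to the set of its neighbors.
    """
    groups = defaultdict(list)
    for v, nbrs in adj.items():
        groups[frozenset(nbrs) | {v}].append(v)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical graph: triangle a, b, c with a pendant vertex d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(critical_cliques(adj))  # [['a', 'b'], ['c'], ['d']]
```

In the example, a and b have the same closed neighborhood {a, b, c} and so form one critical clique, while c and d each form a critical clique of size one.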

Using the above concepts and some very careful and involved construction, 
we are able to prove the following hardness result: 

² So, here the entries in M are supposed to represent the half-distances between species 
instead of full distances. 

Theorem 2. CPRk is NP-complete, for any k ≥ 2. 

We next study another problem closely related to PRk and TRk, which is 
the Steiner kth Root Problem [S|. Recall that given a graph G = (V, E), 
TRk asks for a tree whose node set is exactly V and PRk asks for a tree whose 
leaf-set is exactly V. A more general problem is to ask for a tree T whose node 
set is a superset of V and whose leaf-set is a subset of V, and such that for every 
pair of vertices u and v in V, d_T(u, v) ≤ k if and only if (u, v) ∈ E. We call a 
tree T, whose node set is a superset of V and whose leaf-set is a subset of V, a 
Steiner tree on V. The nodes of T that are not in V are called the Steiner nodes. 

Steiner kth Root Problem (SRk): 

Given a graph G = (V, E), find a Steiner tree T on V such that for each 
pair of vertices u, v ∈ V, (u, v) ∈ E if and only if d_T(u, v) ≤ k. 

Such a Steiner tree T (if it exists) is called a Steiner kth root or a kth root Steiner 
tree of G. G is called the kth Steiner power of T. We also abuse T^k to denote 
the kth Steiner power of T, when there is no confusion from the context. 

Notice that we do not require here a non-leaf node in a Steiner tree to have 
degree at least 3. This requirement is not necessary from the tree root point of 
view. But, one may do so as this requirement is natural from the phylogenetic 
root point of view. Steiner trees satisfying this additional requirement are called 
restricted Steiner trees. Graphs having a restricted Steiner kth root, for k = 1, 2, 
can be recognized in linear time 0. The recognition algorithm can be extended 
to find an ordinary Steiner kth root, for k = 1 and k = 2. However, when k ≥ 3, 
no polynomial-time recognition algorithm has been reported yet to find either 
a Steiner kth root or a restricted Steiner kth root. In the following, we will 
only consider ordinary Steiner roots and show that the closest Steiner kth root 
problem (CSRk), defined in a straightforward way, is NP-complete when k ≥ 2. 

For CSR1, we notice that deleting all Steiner nodes from an (approximate) 
1st root Steiner tree T results in a collection of subtrees such that vertices in 
different subtrees are not adjacent in T^1. Therefore, for any input graph G, the 
best way to build the closest 1st root Steiner tree is to construct a spanning 
tree for each connected component in G and then connect these spanning trees 
together via a Steiner node. That is, a closest 1st root Steiner tree can be 
computed in O(n) time, where n is the number of vertices in the input graph. The 
complexity changes when k marches from 1 to 2. 
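
The construction for CSR1 described above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function name closest_first_root_steiner_tree, the default label "steiner" for the Steiner node (assumed not to clash with any vertex name), and the choice to add the Steiner node only when the graph has at least two components (so that it never becomes a leaf) are all choices made here.

```python
def closest_first_root_steiner_tree(vertices, edges, steiner="steiner"):
    """Spanning tree per component, joined through one Steiner node."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    tree_edges = []
    roots = []          # one representative vertex per connected component
    seen = set()
    for start in vertices:
        if start in seen:
            continue
        roots.append(start)
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in sorted(adj[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    tree_edges.append((node, nxt))
                    stack.append(nxt)

    # A Steiner node is only needed to join two or more components.
    if len(roots) > 1:
        tree_edges.extend((steiner, root) for root in roots)
    return tree_edges


# Hypothetical input: a triangle a, b, c and a separate edge d-e.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
print(closest_first_root_steiner_tree(vertices, edges))
```

In the example the triangle loses one edge to become a spanning tree, so the resulting tree differs from the input graph in exactly one edge; vertices in different components are at distance at least 2 through the Steiner node and therefore remain non-adjacent in the 1st power.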

Theorem 3. CSRk is NP-complete, for any k ≥ 2. 



5 Closing Remarks 

Since CPRk is NP-complete for all k ≥ 2, it would be interesting to know how 
well we can approximate the closest phylogenetic kth root. 

Acknowledgments. CL would like to thank Paul Kearney for bringing CPRk 
to his attention, and Bin Ma and Jinbo Xu for many helpful discussions. 

Abstract. Layered Manufacturing allows physical prototypes of 3D 
parts to be built directly from their computer models, as a stack of 2D 
layers. This paper proposes a new approach, which decomposes the model 
into a small number of pieces, builds each separately, and glues them to- 
gether to generate the prototype. This allows large models to be built in 
parallel and also reduces the need for so-called support structures. De- 
composition algorithms that minimize support requirements are given for 
convex and non-convex polyhedra. Experiments, on convex polyhedra, 
show that the approach can reduce support requirements substantially. 



1 Introduction 

Layered Manufacturing (LM) is an emerging technology that allows the construction 
of physical prototypes of 3D parts directly from their CAD models using a 
“3D printer” attached to a personal computer. LM provides the designer with an 
additional level of physical verification that makes it possible to detect and cor- 
rect design flaws that may have, otherwise, gone unnoticed in the virtual model. 
It is used extensively in the automotive, aerospace, and medical industries. 

The basic principle underlying LM is simple: The CAD model, assumed to be a 
surface triangulation in the industry-standard STL format, is oriented suitably 
and sliced into thin layers by horizontal planes. The layers are then sent over a 
network to a fabrication device which “prints” them one by one, each on top of 
the previous one, with the first layer resting on the platform of the fabrication 
device. (The mechanics of this step depend on the specific LM process 0.) Since 
portions of a layer can overhang previous layers, support structures are generated 
automatically during the process to prop up the overhangs; these are removed 
subsequently via postprocessing. 

* Research of II and RJ supported, in part, by NSF grant CCR-9712226 and by 
NIST grant 60NANB8D0002. Portions of this work were done when RJ visited the 
University of Magdeburg and JS and MS visited the University of Minnesota under 
a joint grant for international research from NSF and DAAD. 

The performance of LM depends, in part, on certain geometric factors: For 
instance, the orientation of the part (or the build direction) determines the num- 
ber of layers, the quantity (i.e., volume) of supports, the location of supports on 
the part, the extent to which supports “stick” to the part (i.e., area of contact), 
and the surface finish and accuracy. The orientation also determines the shape of 
each layer (a polygon), which affects the choice of tool-paths during the printing 
stage. Currently, these issues are resolved by a human operator, based on ex- 
perience. The problem of automating these process-planning decisions has been 
addressed recently by researchers in computational geometry and CAD (details 
and references can be found in the full version 0 of this paper). 

Current process-planning algorithms for LM view the CAD model as a single, 
monolithic unit. We propose an approach which decomposes the model into a 
small number of pieces, builds each separately, and then glues them together 
to generate the prototype. (The number of pieces generated can be controlled 
by the user.) Specifically, given a decomposition direction, we decompose the 
model by intersecting it with a suitable plane perpendicular to this direction; 
we then build the pieces that lie in the same halfspace in the direction given by 
the normal to the plane that is contained in the halfspace. (See Figure 1.) 

Fig. 1. The decomposition-based approach, shown in 2D for convenience. The 
polyhedron P is decomposed by a plane H into two polyhedra, which are then built 
in the indicated directions. 



This approach has several advantages: It allows the construction of large 
models that cannot be accommodated in the workspace as a single piece. More- 
over, the model can be built very quickly by building the pieces in parallel. The 
method also lends itself naturally to applications where LM is used to make 
mold-halves for the model, which are then used to mass-produce the model via 
casting or injection molding. 

A less obvious, but nevertheless crucial, advantage is that it can also reduce 
support requirements substantially. For instance, if a hollow sphere is built the 
conventional way, supports will be needed in the interior void and below the 
lower hemisphere; however, if it is built as two hemispheric shells, in opposite 
directions, then supports will be needed only in the regions previously occupied 
by the void, resulting in a reduction in both support volume and contact-area. 
Reducing support requirements is important because this translates into lower 
material costs and faster build times. 

1.1 Contributions 

We give efficient geometric algorithms to decompose polyhedral (i.e., STL) mod- 
els, w.r.t. a given decomposition direction, so that the support contact-area and, 
independently, the support volume is minimized. In Sections 3 and 4 we give
plane-sweep-based algorithms for convex polyhedra that work by generating and
optimizing expressions for the support volume and contact-area as a function
of the height of the sweep plane. We also give experimental results that show
substantial reductions in support requirements. In Section 5 we consider
non-convex polyhedra, which are considerably more difficult because of the
complex structure of the supports (Figure 2). We handle these by first
identifying certain critical facets (or parts thereof) using cylindrical
decomposition 0, and then applying the algorithm for convex polyhedra to these
critical facets. We also give
a method to keep the size of the decomposition within a user-specified limit. 





Fig. 2. Support structures (light shading), shown in 2D. Supports in the non-convex
case (left) exhibit complexities not seen in the convex case (right): (i) They can rest 
partially on other parts of the polyhedron; (ii) only a fraction of a facet may be in 
contact with supports; (iii) parallel facets can also be in contact with supports. 



To our knowledge, the only related prior work is due to Fekete and 
Mitchell 0. They consider decomposing a polyhedron into special polyhedra 
called histograms that can be built without supports. They prove that deciding 
if a polyhedron of genus zero (or a polygon with holes) can be decomposed into 
k histograms is NP-complete. Our work is different in that we allow supports 
(thus expanding the class of polyhedra buildable by LM), and we seek a decom- 
position into two polyhedra (w.r.t. a given direction) such that the total support 
requirement is minimized, but not necessarily zero.

Due to space limitations, we only sketch the main ideas behind our results
and omit many details and proofs. These can be found in the full paper 0. 

2 Formalizing the Problem 

Let V be the polyhedron of interest. The facets of V are triangles and its bound- 
ary is assumed to be represented in some standard form, say, as a doubly- 
connected edge list jSj. (This can be computed easily from the standard STL 
representation of V.) Let d be a given decomposition direction, which, w.l.o.g., is 
the positive z direction. Let H be any plane perpendicular to d and intersecting 
V; H is the decomposition plane. Let V^+ be the closed polyhedron bounded by
the facets of V (or portions thereof) that are above H, and by the facet V ∩ H;
V^+ may consist of multiple connected components. Define V^- symmetrically
w.r.t. the part of V below H. Define the build direction for V^+ to be d and for
V^- to be -d, and let H be the platform for both polyhedra. (See Figure 1.)

We classify any facet f ∈ V, w.r.t. the given decomposition direction d, as a
front facet, a back facet, or a parallel facet of V depending on whether the angle
between the decomposition direction d and the outward unit-normal, n_f, of f
is less than, greater than, or equal to 90°, respectively.
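The classification above reduces to the sign of a dot product between the facet normal and d. The following is a minimal sketch, assuming each facet is given as a triple of vertices listed counter-clockwise when seen from outside the polyhedron (so the cross product of two edges gives the outward normal). The names `classify_facet` and `eps` are illustrative and do not come from the paper.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def classify_facet(tri, d=(0.0, 0.0, 1.0), eps=1e-12):
    """Classify a triangle as 'front', 'back', or 'parallel' w.r.t. direction d.

    Assumption: vertices are ordered counter-clockwise as seen from outside,
    so cross(b - a, c - a) points along the outward normal.
    """
    a, b, c = tri
    n = cross(sub(b, a), sub(c, a))
    s = n[0] * d[0] + n[1] * d[1] + n[2] * d[2]
    if s > eps:        # angle between normal and d is less than 90 degrees
        return "front"
    if s < -eps:       # angle is greater than 90 degrees
        return "back"
    return "parallel"  # angle is (numerically) equal to 90 degrees
```

The tolerance `eps` is a practical concession to floating-point input; with exact arithmetic it could be zero.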

We say that a facet of a polyhedron needs to be supported iff the angle between 
its outer normal and the build direction of the polyhedron is greater than 90°. 
Thus back facets of V^+ and front facets of V^- need to be supported. The support
polyhedron for a back facet f ∈ V^+ is the closure of the set of all points p ∈ R^3
such that p is not in the interior of V^+ and the ray shot from p in direction d
first enters V^+ through f. Informally, it is bounded from above by f, on the
sides by vertical facets that “drop down” from the edges of f, and from below
by the platform and/or portions of front facets of V^+. If V^+ is convex, then it
is bounded from below by only the platform. (See Figure 2.)

The support contact-area for V^+ is the total surface area of V^+ that is in
contact with supports. It includes the area of all the back facets of V^+, except
V ∩ H, and the areas of those portions of front facets and parallel facets that are
in contact with supports. (Facet V ∩ H rests on the platform and hence needs no
supports. Note that while back facets are completely in contact with supports,
front and parallel facets may be only partially in contact.) The support volume
for V^+ is the total volume of the support polyhedra for all back facets f of V^+
(excluding V ∩ H). Symmetrically for V^-. We can now state our problem:
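For the convex case, where a support polyhedron is bounded from below only by the platform, the support volume of a single back facet is the volume of the vertical prism between the triangle and the plane z = h. The sketch below is an illustration under that assumption (facet entirely on or above the platform); the function name is hypothetical. It uses the standard fact that the integral of a linear height function over a triangle equals the projected area times the height at the centroid.

```python
def support_volume_convex(tri, h):
    """Volume between triangle `tri` and the platform z = h (convex case).

    Assumption: all three vertices satisfy z >= h, and the support is
    bounded below only by the platform.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    # Area of the projection of the triangle onto the xy-plane.
    area_xy = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0
    # Height of the centroid above the platform.
    mean_height = (z1 + z2 + z3) / 3.0 - h
    return area_xy * mean_height
```

Summing this quantity over all back facets of V^+ (and symmetrically over front facets of V^-) gives the total support volume for a convex polyhedron that lies entirely on one side of H; facets cut by H need to be clipped first.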

Problem 1. Given a polyhedron V, with n vertices, and a decomposition direction
d, compute a plane H perpendicular to d which decomposes V into polyhedra
V^+ and V^- such that the total support contact-area or, independently,
the support volume is minimized when V^+ and V^- are built in directions d and
-d, respectively. Additionally, if the user specifies an integer K, then the plane
H should be optimal over all planes that generate no more than a total of K
connected components of V^+ and V^-.

3 Decomposing Convex Polyhedra: Contact-Area

We sweep the plane H : z = h upwards over V. A facet f ∈ V is an active facet
w.r.t. H if H ∩ f ≠ ∅ and at least one vertex of f is strictly above H. Otherwise,
f is an inactive facet w.r.t. H. Intuitively, an active facet is contained partially
in both V^+ and V^-, and small movements of H affect its contribution to the
contact-area, whereas an inactive facet is completely contained in V^+ or V^- and
its contribution to the contact-area is not affected by small movements of H.

It is clear that if f is an inactive front facet then its contribution to the total
contact-area is area(f) if it is in V^-, and zero if it is in V^+, where area(f) is the
area of f. If f is an active front facet, then only the part, f^-, of f that lies below
H is in V^-, so f contributes area(f^-) to the total contact-area. Symmetrically,
if f is an inactive/active back facet. (In the convex case, parallel facets are never
in contact with supports, and are hence ignored.)

The expression for the total contact-area of and V~ consists of the 
inactive-area term and the active-area term, which are, respectively, the contact- 
area contributed by the inactive and active facets. If we move H up or down, 
without crossing a vertex, then the inactive-area does not change, so the inactive- 
area term is simply a real number. However, the active-area changes because the 
fraction of an active facet that contributes to the total contact-area changes as 
H is moved. Lemma 1 shows that the active-area term is quadratic in h.

Lemma 1. Let the current sweep plane be H : z = h. The total contact-area
contributed by the active facets (i.e., the active-area) is of the form
Ah^2 + Bh + C, where the coefficients A, B, and C depend only on the coordinates
of the vertices of the active facets.

Proof. (Sketch) Let f be any active facet, with vertices v_i = (x_i, y_i, z_i),
v_j = (x_j, y_j, z_j), and v_k = (x_k, y_k, z_k). (See Figure 3.) We will prove that
the contact-area contributed by f is of the form a_f h^2 + b_f h + c_f, where the
coefficients a_f, b_f, and c_f depend only on the coordinates of the vertices of f.
This implies the result.

Let H intersect edge v_i v_j at v_ij = (x_ij, y_ij, z_ij = h) and edge v_i v_k at
v_ik = (x_ik, y_ik, z_ik = h). It is easy to verify that

    x_ij = x_i + α_j (h - z_i),    y_ij = y_i + β_j (h - z_i),
    x_ik = x_i + α_k (h - z_i),    y_ik = y_i + β_k (h - z_i),

where α_j = (x_j - x_i)/(z_j - z_i), α_k = (x_k - x_i)/(z_k - z_i),
β_j = (y_j - y_i)/(z_j - z_i), and β_k = (y_k - y_i)/(z_k - z_i).

The part of f that is in contact with supports is the triangle f^-, with vertices
v_i, v_ij, and v_ik. We have area(f^-) = (1/2) |(v_ij - v_i) × (v_ik - v_i)|, where

    v_ij - v_i = α_j (h - z_i) i + β_j (h - z_i) j + (h - z_i) k,
    v_ik - v_i = α_k (h - z_i) i + β_k (h - z_i) j + (h - z_i) k.

It follows that

    area(f^-) = (1/2) (h - z_i)^2 ((β_j - β_k)^2 + (α_j - α_k)^2 + (α_j β_k - α_k β_j)^2)^(1/2).

The coefficient of (h - z_i)^2 above is a constant which depends only on the
coordinates of the vertices of f. (In fact, it is easy to verify that this constant
coefficient is equal to area(f) / ((z_j - z_i)(z_k - z_i)).) □
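The closed form at the end of the proof can be checked numerically. The sketch below is only an illustration, with hypothetical function names: it computes the area of the clipped triangle f^- directly from the intersection points v_ij and v_ik, and compares it with area(f) (h - z_i)^2 / ((z_j - z_i)(z_k - z_i)). It assumes v_i is the lowest vertex and that h lies between z_i and the smaller of z_j and z_k.

```python
import math


def tri_area(a, b, c):
    """Area of the 3D triangle abc via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    w = [c[i] - a[i] for i in range(3)]
    cx = (u[1] * w[2] - u[2] * w[1],
          u[2] * w[0] - u[0] * w[2],
          u[0] * w[1] - u[1] * w[0])
    return 0.5 * math.sqrt(sum(t * t for t in cx))


def point_at_height(a, b, h):
    """Point on segment ab whose z-coordinate equals h (requires a.z != b.z)."""
    t = (h - a[2]) / (b[2] - a[2])
    return tuple(a[i] + t * (b[i] - a[i]) for i in range(3))


def clipped_area(vi, vj, vk, h):
    """Area of the part of triangle (vi, vj, vk) below z = h, computed directly."""
    return tri_area(vi, point_at_height(vi, vj, h), point_at_height(vi, vk, h))


def lemma_area(vi, vj, vk, h):
    """Same area via the closed form stated at the end of the proof."""
    return (tri_area(vi, vj, vk) * (h - vi[2]) ** 2
            / ((vj[2] - vi[2]) * (vk[2] - vi[2])))
```

Expanding `lemma_area` in powers of h gives the coefficients a_f, b_f, and c_f for this facet while H crosses the two edges incident to v_i.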

Fig. 3. Intersection of H and active front facet f.

We sort the vertices of V by non-decreasing z-coordinates, as v_1, v_2, ..., v_n,
classify each facet f ∈ V as a front, back, or parallel facet, and compute area(f).
We set the active-area term to zero, the inactive-area term to the total area of
the back facets, and the current estimate of the minimum contact-area to their
sum. We then scan the vertices in their sorted order. Let v_ℓ be the current vertex,
1 ≤ ℓ ≤ n. For each facet f incident to v_ℓ, we do the following:

Case 1: v_ℓ is the lowest vertex of f. (Thus, f changes from inactive to active
at v_ℓ.) If f is a back facet, then we subtract area(f) from the inactive-area term.
Using Lemma 1 we compute and add the expression a_f h^2 + b_f h + c_f to the
active-area term, thereby including the contribution of f^+ to the total
contact-area when H is between z_ℓ and z_{ℓ+1}. If f is a front facet, then we update
only the active-area term to include the contribution of f^- to the total
contact-area.

Case 2: v_ℓ is the highest vertex of f. This is symmetric to Case 1.

Case 3: v_ℓ is the middle vertex of f. Here f continues to be active, but the
active-area term must be updated since H intersects a different edge of f above
v_ℓ than it did below. We perform this update using Lemma 1.

After all facets incident to v_ℓ have been processed, we have a new active-area
term Ah^2 + Bh + C, valid for h in the interval [z_ℓ, z_{ℓ+1}]. We minimize this
using standard techniques from calculus and update the current estimate of the
minimum contact-area. The running time is dominated by the time to sort; the
time to process any vertex is proportional to its degree, hence O(n) in total.
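The per-interval minimization step is elementary: a quadratic restricted to a closed interval attains its minimum at an endpoint or at its stationary point. A minimal sketch follows; the function name and the (value, h) return convention are illustrative choices, not taken from the paper.

```python
def minimize_quadratic(A, B, C, lo, hi):
    """Minimize A*h^2 + B*h + C over lo <= h <= hi.

    Returns (minimum value, h at which it is attained).
    """
    candidates = [lo, hi]
    if A > 0:
        # The stationary point is a minimum only when the parabola opens upward.
        h_star = -B / (2.0 * A)
        if lo < h_star < hi:
            candidates.append(h_star)
    return min((A * h * h + B * h + C, h) for h in candidates)
```

In the sweep, this would be called once per interval [z_ℓ, z_{ℓ+1}] with C taken as the constant of the active-area term plus the current inactive-area term, keeping the best (value, h) pair seen so far.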

Theorem 1. The contact-area version of Problem 1 can be solved in O(n log n)
time for a convex polyhedron V with n vertices (O(n) time if the vertices are
given in sorted order in the decomposition direction d).



3.1 Experimental Results 

We have implemented the above algorithm in C++, using floating point
putations in double-precision, and tested it on randomly-generated convex poly- 
hedra. The tests were done on a SUN Ultra 10 Sparc machine with 256 MB of 
main memory and a 440 MHz processor. 

For our test polyhedra, we used qhull P] to generate n+1 points at random
on a cone whose major axis was along the z-axis, for n in the range 20,000 to 
200,000 (the additional point was the apex of the cone). We then rotated the 
cone by a randomly-chosen angle to make it non-symmetric about the origin.
For each n, we generated two non-symmetric test polyhedra in this fashion.
Table 1 shows a subset of our results. (Not all results are shown due to space
constraints.) For each input size, the running times on the two polyhedra were
nearly the same; therefore, we averaged the times. This closeness in run times is 
to be expected since the complexity of the algorithm depends primarily on the 
graph structure of V, which is the same regardless of orientation. 



Table 1. Minimum support contact-area and support volume for convex polyhedra 
generated from random points on a cone; each polyhedron has been rotated by the 
indicated angle to make it non-symmetric about the origin. Here “non-decomp area” 
and “non-decomp volume” refer to the contact-area and volume when the polyhedra 
are built without decomposition; observe the significant reductions achieved via de- 
composition. 



                 Support Contact-area                         Support Volume
#verts    angle  min           hmin   mean      non-decomp    min       hmin   mean      non-decomp
n                contact-area         time (s)  contact-area  volume            time (s)  volume
20001     200    7752.6        -3.2   .6        55300.5       10999.5   -21.1  4.2       965039.4
          37     11705.0       2.3              57111.7       26389.1   21.5             1107383.9
40001     40     12193.1       2.7    1.3       58304.3       24003.9   21.2   00        947591.6
          75     3973.7        .7               52005.6       3631.2    -4.4             994452.0
60001     240    7037.8        -.5    2.0       55082.1       12900.6   -4.1   12.7      815518.8
          112    5315.5        -.5              55068.2       8118.6    2.4              865432.8
80001     80     2714.5        0      2.8       52272.1       1359.2    ?      17.3      ?
          ?      10108.5       -1.5             51825.6       20249.8   -24.6            887435.1
100001    ?      2649.1        ?      3.6       52160.8       1349.3    -4.2   21.5      1014432.7
          187    3447.8        -3.6             59820.0       1068.5    -10.7            1033711.3



4 Decomposing Convex Polyhedra: Support Volume 

For any given decomposition plane H : z = h, the expression for the support 
volume consists of the inactive-volume term and the active-volume term, which 
are, respectively, the total volumes of the support polyhedra contributed by 
the inactive and the active facets. If we move H up or down without crossing
a vertex, then the active-volume and the inactive-volume both change (unlike
contact-area, where only the active-area changes). Lemma 2 and Lemma 3
(proofs omitted here) establish the dependence of these terms on h.

Lemma 2. For the decomposition plane H : z = h, the active-volume term is
of the form Ah^3 + Bh^2 + Ch + D, where the coefficients A, B, C, and D depend
only on the coordinates of the vertices of the active facets.

Lemma 3. For the decomposition plane H : z = h, the inactive-volume term is
of the form Ch + D, where the coefficients C and D depend only on the coordinates
of the vertices of the inactive facets.

The volume-minimization algorithm is similar to the one in Section 3 except
that we minimize the sum of the active-volume and inactive-volume terms. 
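Since the sum of the active-volume and inactive-volume terms is a cubic in h, the per-interval minimization again has a closed form: evaluate the cubic at the interval endpoints and at any real roots of its derivative that fall inside the interval. The following is a minimal sketch with hypothetical names; the caller is assumed to have already added the linear inactive-volume coefficients into C and D.

```python
import math


def minimize_cubic(A, B, C, D, lo, hi):
    """Minimize A*h^3 + B*h^2 + C*h + D over lo <= h <= hi.

    Returns (minimum value, h at which it is attained).
    """
    candidates = [lo, hi]
    # Derivative: 3A h^2 + 2B h + C.
    a, b, c = 3.0 * A, 2.0 * B, C
    if a == 0.0:
        if b != 0.0:
            candidates.append(-c / b)      # derivative is linear
    else:
        disc = b * b - 4.0 * a * c
        if disc >= 0.0:
            r = math.sqrt(disc)
            candidates.append((-b - r) / (2.0 * a))
            candidates.append((-b + r) / (2.0 * a))
    # Keep only stationary points inside the interval (endpoints always qualify).
    feasible = [h for h in candidates if lo <= h <= hi]
    return min((((A * h + B) * h + C) * h + D, h) for h in feasible)
```

Evaluating at every candidate, rather than testing second-derivative signs, keeps the routine short at the cost of at most two extra polynomial evaluations per interval.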

Theorem 2. The support volume version of Problem 1 can be solved in
O(n log n) time for a convex polyhedron V with n vertices (O(n) time if the
vertices are given in sorted order in the decomposition direction d).

This algorithm has also been implemented; Table 1 shows some of the results.

5 Decomposing Non-convex Polyhedra 

Despite the challenges posed by non-convex polyhedra (Figure 2), we show that
they can be handled in essentially the same way as convex polyhedra after doing
some initial processing. We focus on the contact-area version of Problem 1. (The
volume problem is easier, because parallel facets do not contribute to support
volume, as they do to contact-area, and so can be ignored.) We partition each
front or back facet of V into two classes of triangles, called black and gray. (One
of these classes may be empty.) A black triangle t will always be completely in
contact with supports, regardless of the position of H; thus, it always contributes
area(t) to the total contact-area and can be ignored for minimization purposes.
A gray triangle t will contribute an amount that is a quadratic function of the
height of H, and so needs to be accounted for. A parallel facet of V is partitioned
into three classes of triangles, called black, gray, and white. (Up to two of these
classes may be empty.) Black and gray triangles are as above; a white triangle
will never be in contact with supports, regardless of the height of H, and so can
be ignored. Thus, only gray triangles are relevant to the minimization problem.
We will see that these can be handled as in the convex case.

5.1 Black, Gray, and White Triangles 

Imagine that V is built, without decomposition, in direction d. Consider the
supports (if any) that are in contact with a front facet f (due to a back facet
“above” it in direction d). Their footprint on f, i.e., their intersection with f,
is a collection of polygons, called black polygons, and their triangulation yields
the set, B_f, of black triangles associated with f. For any point p in a black
triangle, the ray emanating from p in this direction must intersect V. If p' is the
first intersection of the ray and V (not counting p), then the segment p'p is the
support for p'. The complement of the black polygons on f is a collection of gray
polygons, and their triangulation yields the set, G_f, of gray triangles associated
with f. No point in a gray triangle is in contact with supports for build direction
d. If f is a back facet of V, then a symmetric discussion applies w.r.t. building
V without decomposition in direction -d.

Next, consider a parallel facet f. Imagine that we build V without decomposition
in direction d, and, independently, in direction -d. The black (resp. gray,
white) polygons consist of all points on f that are in contact with supports for
both (resp. exactly one, neither) direction. The triangulations of these polygons
yield the sets B_f, G_f, and W_f of black, gray, and white triangles on f.

Lemma 4. Let V be built via decomposition and let H : z = h be any decomposition
plane. Consider a black or gray triangle from a front, back, or parallel
facet, or a white triangle from a parallel facet. The contribution of the black or
white triangle to the total contact-area is a constant independent of h. The
contribution of the gray triangle is a quadratic function of h if the triangle is
active, and a constant otherwise.

Proof. (Sketch) W.l.o.g., let t be a black triangle from a front facet. (The
discussion for back and parallel facets is similar.) Assume that t is active, and let
t^+ = t ∩ V^+ and t^- = t ∩ V^-. Thus, all of t^- will require support. V^+ is built in
direction d, and the part of V^+ that is directly above t^+ is the same as the part
of V that would be directly above t^+ if V were built without decomposition in
direction d. Since t^+ is in contact with supports in the latter case, it is also in
contact with supports in the case with decomposition. Thus, all of t is in contact
with supports and it always contributes area(t) to the total contact-area.
A similar argument applies if t is inactive.

Next, let t be an active gray triangle. As above, t^- ∈ V^- will be in contact
with supports. However, no part of t^+ will be in contact with supports, since
in the case where V is built in direction d, without decomposition, no part of t
is in contact with supports. Therefore, t’s contribution to the total contact-area
depends on the height of H and a proof similar to that of Lemma 1 shows that
this is quadratic in h. If t is inactive, then it is either not in contact with supports
at all or is completely in contact, so the contribution is either zero or area(t).

If t is a white triangle from a parallel facet, then clearly no part of it is in
contact with supports in V^+ or in V^-, so its contribution is always zero. □



5.2 Computing Black, Gray, and White Triangles 

We use cylindrical decomposition 0. From each edge, e, of each back facet, b,
we erect a strip, V_{e,b}, which passes exactly through e and extends vertically
downwards, in direction -d. As soon as a part of V_{e,b} intersects another facet
of V (which must be a front facet), we stop propagating that part below the
intersected facet; however, we continue propagating the remaining parts of
V_{e,b}. Each such intersection of V_{e,b} with a front facet will be a part of the
footprints that we are trying to compute.

To perform this step efficiently, we compute the intersection of each front
facet with V_{e,b} to get a set, L, of line segments. We do a trapezoidal
decomposition 0 of L ∪ {e} in the plane of V_{e,b}. We identify each trapezoid in this
decomposition that is adjacent to e at the top and to a line segment ℓ ∈ L at the
bottom. The bottom edge of this trapezoid is one of the sought intersections of
V_{e,b} with a front facet. We store this edge with the front facet that generated ℓ.

However, not all footprints on front facets will be discovered by the above
process. For instance, if the projection of b completely covers a front facet, f,
below it, then none of the strips V_{e,b} erected from b will intersect f, and, yet,
supports for b will rest on f. To handle such situations, we also erect from each
edge, e, of each front facet, f, a strip V_{e,f} vertically upwards, stopping the
propagation of any part of the strip as soon as it intersects a (back) facet above it,
while continuing to propagate other parts. To compute these intersections we do
a trapezoidal decomposition of the set L' ∪ {e}, where L' contains the
intersections of all the back facets with V_{e,f}. For each trapezoid in this decomposition
that is incident to e at the bottom and to a segment ℓ' ∈ L' at the top, we take
the top edge of the trapezoid as the intersection of V_{e,f} with the back facet that
generated ℓ', and store it with that back facet.

After we have done this for all front and back facets, we have associated with 
each facet a list of line segments corresponding to intersections of the different 
vertical strips with the facet. Since a strip is not propagated below an intersected 
facet, it is easy to see that the line segments associated with a facet are
non-crossing (but may be touching). For each facet, we compute the arrangement [3]
of the set consisting of the associated segments and the edges bounding the facet. 

Let f be any front facet and let c be any cell of the arrangement computed
on f. Then c is the footprint of a support on f for build direction d, hence a
black polygon, iff there is a cell c' on a back facet b above f, such that c' projects
to c. (The cells c and c' form the bottom and top facets of a support cylinder;
the other facets of this cylinder are vertical and bounded below and above by
edges of c and c'.) Any other cell of f is a gray polygon.

We can identify the black triangles of f directly (instead of first computing
the black polygons) using an approach given in [^. We first triangulate the 
cells of the arrangements on all the front facets. We make a list, F, of these 
triangles along with their centroids. We sort F lexicographically on the x-, y-, 
and z-coordinates of the centroids, taken in that order. We make a similar sorted 
list B for the back facets. Then a simultaneous scan of the two lists suffices to 
identify matching pairs of triangles, i.e., pairs where one triangle is from B and 
the other is from F such that the former is above the latter and projects to it. 
For each matching pair, the triangle from F is a black triangle; all unmatched
triangles of F are gray triangles. 
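The sort-and-scan matching can be sketched as follows. This is an illustration under several assumptions that are not stated in the text: the two sorted lists are merged into one list before scanning (a simplification of the simultaneous scan), matching triangles have identical xy-projections vertex for vertex, a matching pair is adjacent in the sorted order, and coordinates are compared after rounding to absorb floating-point noise. All names (`black_front_triangles`, `xy_key`, `ndigits`) are hypothetical.

```python
def centroid(tri):
    """Centroid of a triangle given as three (x, y, z) tuples."""
    xs, ys, zs = zip(*tri)
    return (sum(xs) / 3.0, sum(ys) / 3.0, sum(zs) / 3.0)


def xy_key(tri, ndigits=9):
    """Projection of the triangle onto the xy-plane, as a sorted vertex tuple."""
    return tuple(sorted((round(x, ndigits), round(y, ndigits)) for x, y, _ in tri))


def black_front_triangles(front, back, ndigits=9):
    """Indices of triangles in `front` with a back triangle directly above.

    Assumption: a front triangle and the back triangle it supports project
    to the same xy-triangle and are adjacent when all triangles are sorted
    lexicographically by the (x, y, z) coordinates of their centroids.
    """
    tagged = [(centroid(t), "F", i, xy_key(t, ndigits)) for i, t in enumerate(front)]
    tagged += [(centroid(t), "B", i, xy_key(t, ndigits)) for i, t in enumerate(back)]
    # Lexicographic sort on the centroid coordinates, x then y then z.
    tagged.sort(key=lambda r: (round(r[0][0], ndigits), round(r[0][1], ndigits), r[0][2]))
    black = set()
    for lower, upper in zip(tagged, tagged[1:]):
        # A front triangle immediately below a back triangle with the same
        # projection forms the bottom and top of one support cylinder.
        if lower[1] == "F" and upper[1] == "B" and lower[3] == upper[3]:
            black.add(lower[2])
    return black
```

Front triangles whose index is not returned would be the gray ones in this sketch.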

A symmetric approach yields the black and gray triangles for the back facets. 

To compute the relevant triangles for a parallel facet f, we take the strip
V_f which is in the plane of f and exactly contains it, intersect it with each
back facet that is above f to get a set, A, of line segments, and do a trapezoidal
decomposition of A. Let T_f be the set of trapezoids in this decomposition that are
bounded from above by some segment of A, unbounded below, and intersect f.
Compute a symmetric set T'_f w.r.t. the front facets below f. The black polygons
on f are intersections of pairs from T_f and T'_f. Any part of a trapezoid in T_f or
T'_f that is not a black polygon is a gray polygon. The complement of the union
of the gray and black polygons on f is the set of white polygons. These sets
can be found by a simple sort-and-scan step and the corresponding triangles can
then be identified by triangulation.

Lemma 5. The set of black, gray, and white triangles for all facets of an
n-vertex polyhedron V can be computed in O(n^2 log n) time.

5.3 The Algorithm for Contact-Area

We compute for each front, back and parallel facet of V, a list of the black,
gray, and white polygons. As discussed in Section 5, only the gray triangles
are relevant when decomposing V to minimize the contact-area of supports. We
store the subdivision defined by the union of the set of gray triangles in a
doubly-connected edge list and perform a sweep over them as in Section 3 to compute
the optimum decomposition plane H. There are O(n^2) gray triangles and the
sweep therefore takes O(n^2 log n) time. (Even though the algorithm in Section 3
is for convex polyhedra, the sweep does not depend on convexity per se, and so
it works for our collection of gray triangles as well.)

As noted earlier, essentially the same approach works for volume minimization
also. We will see in Section 5.4 that the size of the decomposition can be
controlled in O(n log n) time. We conclude:

Theorem 3. The contact-area and support volume versions of Problem 1 can
be solved in O(n^2 log n) time for a non-convex polyhedron V with n vertices.

5.4 Controlling the Size of the Decomposition 

The optimal plane computed by the algorithm in Section 5.3 could decompose a
non-convex polyhedron V into as many as O(n) polyhedra, which is undesirable
since it increases the cost of re-assembling V. Ideally, the designer should be
able to specify an integer K, and the algorithm should compute, among
all possible planes that generate no more than K polyhedra, a plane which is
optimal w.r.t. support contact-area or volume. We show how this can be done via
a preprocessing step. The idea is to partition the z-axis into O(n) intervals, I_j,
such that all planes whose heights are in I_j decompose V into the same number,
k_j, of polyhedra. We then run the sweep algorithm of the previous section but
do the minimization step only in those intervals I_j for which k_j ≤ K.

Let z_1 < z_2 < ... < z_t, t ≤ n, be the distinct z-coordinates of the n vertices
of V. The preprocessing involves two sweeps. The first sweep is in the positive
z direction and it computes a set of intervals on the z-axis and associates with
each interval an integer which is the number of connected components of V^-
generated by any plane whose height is in the interval. Observe that the number
of connected components of V^- w.r.t. a plane of height z_j is the same as the
number w.r.t. a plane whose height is anywhere in the interval (z_j, z_{j+1}); let this
number be k_j^-. Thus, the first sweep computes intervals of the form [z_j, z_{j+1}) and
associates with each the integer k_j^-. Symmetrically, the second sweep is in the
negative z direction and it computes intervals of the form (z_j, z_{j+1}] and associates
with each an integer k_j^+, which is the number of connected components of V^+
w.r.t. any plane whose height is in (z_j, z_{j+1}]. Once these two sets of intervals have
been computed, a single scan of them suffices to compute the desired intervals
I_j and the corresponding integers k_j. Specifically, each interval I_j is either of the
form [z_j, z_j], with k_j = k_j^- + k_{j-1}^+, or of the form (z_j, z_{j+1}), with
k_j = k_j^- + k_j^+.

Consider the first sweep. At any time, the vertices of the different connected
components of V^- form a collection of disjoint sets. We maintain these using
a Union-Find-Makeset data structure. We initialize the structure to empty and
set the current number, c, of connected components of V^- to zero. Let z_j be the
current z-coordinate in the sweep and let V_j be the set of vertices of V at this
z-coordinate. We consider each vertex v ∈ V_j in turn and process it as follows:
We create a new set containing just v and increment c by one. Then for each
neighbor, u, of v such that u is already in the Union-Find-Makeset data structure
we do the following: If u and v are in different connected components, then we
union the sets containing u and v, and decrement c by one. Notice that in this
sweep, we only “add” edges to V^-, so the connected components of V^- always
merge, never split. Thus, a Union-Find-Makeset structure suffices to maintain
the connected components of V^-. After all vertices of V_j have been processed,
we set k_j^- to c. At the end of the sweep, all the intervals [z_j, z_{j+1}), and their
associated integers k_j^-, will have been computed. The running time is dominated
by the O(n log n) time for the sorting.
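The first sweep can be sketched directly from the description above. The code below is a minimal illustration, assuming the polyhedron is supplied as a list `z` of vertex heights and an adjacency list `adj` (both hypothetical input conventions); it uses path halving without union by rank, which is enough for a sketch but is not tuned for the best asymptotic bound.

```python
def components_below(z, adj):
    """Upward sweep counting connected components of the part below the plane.

    z[i]   : z-coordinate of vertex i
    adj[i] : iterable of neighbors of vertex i (edges of the polyhedron)
    Returns a list of (z_j, k_j_minus) pairs, one per distinct height z_j,
    where k_j_minus is the component count for planes in [z_j, z_{j+1}).
    """
    parent = {}

    def find(x):
        # Find the set representative, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    order = sorted(range(len(z)), key=lambda i: z[i])
    result = []
    c = 0  # current number of connected components
    i = 0
    while i < len(order):
        zj = z[order[i]]
        # Process every vertex at the current height before recording k_j.
        while i < len(order) and z[order[i]] == zj:
            v = order[i]
            parent[v] = v          # makeset
            c += 1
            for u in adj[v]:
                if u in parent:    # neighbor already swept
                    ru, rv = find(u), find(v)
                    if ru != rv:
                        parent[ru] = rv   # union
                        c -= 1
            i += 1
        result.append((zj, c))
    return result
```

For example, an arch with two feet at height 0 joined at height 2 yields two components until the sweep reaches the top vertex, after which there is one. The second sweep is the same routine run on negated heights.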

6 Conclusion 

We have presented a new decomposition-based approach to LM, which reduces 
substantially the support requirements of the process, while also realizing other 
benefits as discussed in Section 1. Throughout, we have assumed a fixed
decomposition direction d and found an optimum plane that is normal to d. A more
challenging problem is to compute over all directions d an optimum decompo- 
sition plane (while also limiting the number of pieces). We are pursuing this 
problem currently and plan to report on it in a future paper.

Abstract. We explore the following problem: given a collection of
creases on a piece of paper, each assigned a folding direction of mountain 
or valley, is there a flat folding by a sequence of simple folds? There are 
several models of simple folds; the simplest one-layer simple fold rotates 
a portion of paper about a crease in the paper by ±180°. We first consider 
the analogous questions in one dimension lower — bending a segment into 
a flat object — which lead to interesting problems on strings. We develop 
efficient algorithms for the recognition of simply foldable 1-D crease pat- 
terns, and reconstruction of a sequence of simple folds. Indeed, we prove 
that a 1-D crease pattern is flat-foldable by any means precisely if it is
by a sequence of one-layer simple folds. 

Next we explore simple foldability in two dimensions, and find a sur- 
prising contrast: “map” folding and variants are polynomial, but slight 
generalizations are NP-complete. Specifically, we develop a linear-time 
algorithm for deciding foldability of an orthogonal crease pattern on a 
rectangular piece of paper, and prove that it is (weakly) NP-complete 
to decide foldability of (1) an orthogonal crease pattern on an orthogonal
piece of paper, (2) a crease pattern of axis-parallel and diagonal
(45-degree) creases on a square piece of paper, and (3) crease patterns 
without a mountain/valley assignment. 



1 Introduction 

The easiest way to refold a road map is differently. 

— Jones’s Rule of the Road (M. Gardner pj) 

Perhaps the best-studied problem in origami mathematics is the characterization 
of flat-foldable crease patterns. A crease pattern is a straight-edge embedding of
a graph on a polygonal piece of paper; a flat folding must fold along all of the
edges of the graph, but no more. For example, two crease patterns are shown in
Figure 1. The first one folds flat into a classic origami crane, whereas the second
one cannot be folded flat (unless the paper is allowed to pass through itself),
even though every vertex can be “locally” flat folded.

Fig. 1. Sample crease patterns. Left: the classic crane. Right: pattern of Hull jS], which
cannot be folded flat, for any mountain-valley assignment.



The algorithmic version of this problem is to determine whether a given 
crease pattern is flat-foldable. The crease pattern may also have a direction of 
“mountain” or “valley” assigned to each crease, which restricts the way in which 
the crease can be folded. (Our figures adhere to the standard origami convention 
that valleys are drawn as dashed lines and mountains are drawn as dot-dashed 
lines.) 

It is known that the general problem of deciding flat foldability of a crease 
pattern is NP-hard (Bern and Hayes 0). In this paper, we consider the important 
and very natural case of recognizing crease patterns that arise as the result of 
flat foldings using simple foldings. In this model, a flat folding is made by a 
sequence of simple folds, each of which folds one or more layers of paper along a 
single line segment. As we define in Section 2, there are different types of simple
folds (termed “one-layer,” “some-layers,” and “all-layers”), depending on how 
many layers of paper are required or allowed to be folded along a crease. 

Note that not every flat folding can be achieved by a simple folding. For 
example, the crane in Figure 1 (left) cannot be made by a simple folding. Also,
the hardness gadgets of 0 require nonsimple folds. 

The problem we study in this paper is that of determining whether a given 
crease pattern (usually with specified mountain and valley assignments) can be 
folded flat by a sequence of simple folds, and if so, to construct such a sequence 
of folds. 

Several of our results are based on the special case in which the creases in the 
piece of paper are all parallel to one another. This case can be seen to be
equivalent to a one-dimensional folding problem of folding a line segment (“paper”)
according to a set of prescribed crease points (possibly labeled “mountain” or
“valley”). We will therefore refer to this special case, which has a rich structure
of its own, as the “1-D” case to distinguish it from the general 2-D problem. In 
contrast to the 2-D problem, we show that 1-D flat foldability is equivalent to 
1-D simple foldability. 

Motivation. In addition to its inherent interest in the mathematics of origami, 
our study is motivated by applications in sheet metal and paper product
manufacturing, where one is interested in determining if a given structure can be
manufactured using a given machine. (See references cited below.) While origamists
can develop particular skill in performing nonsimple folds to make beautiful
artwork, practical problems of manufacturing with sheet goods require simple and
constrained folding operations. Our goal is to develop a first suite of results that 
may be helpful towards a fuller algorithmic understanding of the several manu- 
facturing problems that arise, e.g., in making three-dimensional cardboard and 
sheet-metal structures. 

Related Work. Our problem is related to the classic combinatorics question 
of map folding ESI. This question asks for the number of foldings of a particular 
crease pattern, namely an m x n rectangular grid, by a sequence of simple folds. 
See also the discussion in Gardner’s book E]- In contrast with this combinatorics 
question, we study the algorithmic complexity of the decision problem, also in 
some more general instances of crease patterns. 

The mathematical and algorithmic problems arising in the study of flat 
origami have been examined by several researchers, e.g., Hull [^, Justin EDI, 
Kawasaki and Lang m Of particular relevance to our work is the paper of 
Bern and Hayes 0 , in which the general problem of deciding flat foldability of a 
crease pattern is shown to be strongly NP-hard. Demaine et al. Pj used compu- 
tational geometry techniques to show that any polygonal (connected) silhouette 
can be obtained by simple folds from a rectangular piece of paper. 

There has been quite a bit of work on the related problems of manufacturabil- 
ity of sheet metal parts (see, e.g., I2H) and folding cartons (see, e.g., P))- Exist- 
ing CAD/CAM techniques (including BendCad and PART-S) rely on worst-case 
exponential-time state space searches (using the A* algorithm). In general, the 
problem of bend sequence generation is a challenging (and provably hard fP) 
coordinated motion planning problem. For example, Lu and Akella [14] utilize a
novel configuration-space formulation of the folding sequence problem for fold- 
ing cartons using fixtures; their search, however, is still worst-case exponential 
time. Our work differs from the prior work on sheet metal and cardboard bend- 
ing in that the structures we are folding are ultimately “flat” in their folded 
states (all bend angles in the input crease pattern are ±180°, according to a 
mountain-valley assignment that is part of the input crease pattern). Also, we
are concerned only with the feasibility of the motion of the (stiff) material that 
is being folded - does it collide with itself during the folding motion? We are not 
addressing here the issues of reachability by the tools that perform the folding. 
As we show, even with the restrictions that come with the problems we study, 
there is a rich mathematical and algorithmic theory of foldability. 

Summary of Our Results. (Due to space limitations, many of the proofs and details
are omitted from this extended abstract. The reader is referred to the full paper.)

(1) We analyze the 1-D one-layer and some-layers cases, giving a full
characterization of flat-foldability and an O(n) algorithm for deciding foldability and
producing a folding sequence, if one exists.

(2) We analyze the 1-D all-layers case as a “string folding” problem. In addition
to a simple O(n^2) algorithm, we give an algorithm utilizing suffix trees that
requires time linear in the bit complexity of the input, and a randomized
algorithm with expected O(n) running time.

(3) We give a linear-time algorithm for deciding foldability of orthogonal crease 
patterns on a rectangular paper (the “map folding problem”), in the one-, 
some-, and all-layers cases, based on our 1-D results. 

(4) We prove that it is (weakly) NP-complete to decide foldability of an orthog- 
onal crease pattern on a piece of paper that is more general than a rectangle: 
a simple orthogonal polygon. 

(5) We also prove that it is (weakly) NP-complete to decide foldability of a rect- 
angular piece of paper with a crease pattern that includes diagonal creases 
(at 45-degrees), in addition to axis-parallel creases. 

(6) We show that it is (weakly) NP-complete to decide foldability of an orthogonal
piece of paper having a crease pattern for which no mountain-valley
assignment is given. 

2 Definitions 

We are concerned with foldings in one and two dimensions. A one-dimensional 
piece of paper is a (line) segment in R^1. A two-dimensional piece of paper is a
(connected) polygon in R^2, possibly with holes. In both cases, the paper is folded
through one dimension higher than the object; thus, segments are folded through
R^2 and polygons are folded through R^3. Creases have one less dimension; thus,
a crease is a point on a segment and a line segment on a polygon. 

A crease pattern is a collection of creases on the piece of paper, no two of 
which intersect except at a common endpoint. A folding of a crease pattern is an 
isometric embedding of the piece of paper, bent along every crease in the crease 
pattern (and not bent along any segment that is not a crease). In particular, each 
facet of paper must be mapped to a congruent copy, the connectivity between 
facets must be preserved, and the paper cannot pass through itself. See Figure 2.




Fig. 2. Sample nonflat foldings in one and two dimensions. 



A flat folding has the additional property that it lies in the same space as the 
unfolded piece of paper. That is, a flat folding of a segment lies in R^1, and a flat
folding of a polygon lies in R^2. In reality, there can be multiple layers of paper
at a point, so the folding really occupies a finite number of infinitesimally close
copies of R^1 or R^2. More formally, a flat folding can be specified by a function
mapping the vertices to their folded positions, together with a partial order of
the facets of paper that specifies their overlap order mm- 

If we orient the piece of paper to have a top and bottom side, we can talk 
about the direction of a crease in a flat folding. A mountain brings together the 
bottom sides of the two adjacent facets of paper, and a valley brings together 
the top sides. A mountain-valley assignment is a function from the creases in 
a crease pattern to {M,V}. Together, a crease pattern and a mountain-valley 
assignment form a mountain-valley pattern. 

This paper is concerned with the following generic question. 

Problem 1. Simple Folding: Given a mountain-valley pattern, is there a simple 
folding satisfying the specified mountains and valleys? If so, construct such a 
simple folding. 

There are three natural versions of this problem, depending on the type of 
“simple folds” allowed. In general, a simple folding is a sequence of simple folds. 
Each simple fold takes a flat-folded piece of paper, and folds it into another flat 
folding using additional creases. There are three types of simple folds: one-layer, 
all-layers, and some-layers. 

A one-layer simple fold f is a crease on the folded piece of paper, together
with a direction. If we look at the unfolded piece of paper, then f partitions
it into two parts, call them A and B. Performing f corresponds to rotating A
about f by ±180°, where ± depends on the fold direction, if this does not cause
the paper to cross itself. This makes just a single crease, which is what we mean
by folding one layer of paper.

An all-layers simple fold f is also a crease together with a direction. Now
we consider the partition of the flat folding (instead of the unfolded piece of
paper) by f into two parts, call them A and B again. Performing f corresponds
to rotating (all layers of) A about f by ±180°. This makes a crease through all
of the layers of paper at f. Note that this type of fold can never cause the paper
to cross itself.

Finally, a some-layers simple fold f is the most general. It takes some of the
top [bottom] layers of A, and rotates them about f by 180° [-180°], provided
the paper does not cross itself.



3 1-D: One-Layer and Some-Layers 

This section is concerned with the 1-D one-layer simple-fold problem. We will 
prove the surprising result that we only need to search for one of two local 
operations to perform. The two operations are called crimps and end folds, and 
are shown in Figure 3.

More formally, let c_1, ..., c_n denote the creases on the segment, oriented so
that c_i is left of c_j for i < j. Let c_0 [c_{n+1}] denote the left [right] end of the
segment. Despite the “c” notation (which is used for convenience), c_0 and c_{n+1}
are not considered creases; instead they are called the ends.

Fig. 3. The two local operations for one-dimensional one-layer folds.

First, a pair (c_i, c_{i+1}) of consecutive creases is crimpable if c_i and c_{i+1} have
opposite directions and |c_{i-1} - c_i| ≥ |c_i - c_{i+1}| ≤ |c_{i+1} - c_{i+2}|. Crimping such a
pair corresponds to folding c_i and then folding c_{i+1}, using one-layer simple folds.

Second, c_0 is a foldable end if |c_0 - c_1| ≤ |c_1 - c_2|, and c_{n+1} is a foldable end
if |c_{n-1} - c_n| ≥ |c_n - c_{n+1}|. Folding such an end corresponds to performing a
one-layer simple fold at the nearest crease (crease c_1 for end c_0, and crease c_n
for end c_{n+1}).
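The two local operations can be detected directly from the segment lengths. The following is a minimal sketch, not the authors' implementation. It assumes the pattern is given as a list `lengths` holding the n + 1 distances between consecutive points c_0, ..., c_{n+1} and a list `dirs` holding the n crease directions (both names are hypothetical), and it reads the length comparisons as non-strict (≤ and ≥), which is an assumption.

```python
def crimpable_pairs(lengths, dirs):
    """Return the 1-based indices i such that (c_i, c_{i+1}) is crimpable.

    lengths[k] is the distance |c_k - c_{k+1}|; dirs[k-1] is the direction
    ('M' or 'V') of crease c_k. Both conventions are assumptions of this sketch.
    """
    n = len(dirs)
    return [
        i
        for i in range(1, n)
        if dirs[i - 1] != dirs[i]
        and lengths[i - 1] >= lengths[i] <= lengths[i + 1]
    ]


def foldable_ends(lengths, dirs):
    """Return which ends ('left' for c_0, 'right' for c_{n+1}) are foldable."""
    n = len(dirs)
    ends = []
    if n >= 1 and lengths[0] <= lengths[1]:
        ends.append("left")
    if n >= 1 and lengths[n - 1] >= lengths[n]:
        ends.append("right")
    return ends
```

For example, with lengths 3, 1, 2 and directions M, V the only available operation is the crimp of (c_1, c_2); with lengths 1, 2, 1 and directions M, M both ends are foldable and no pair is crimpable.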

We claim that one of the two local operations exists in any flat-foldable 1-D
mountain-valley pattern. We claim further that an operation exists for any
pattern satisfying a certain “mingling property.” Specifically, a 1-D
mountain-valley pattern is called mingling if for every sequence c_i, c_{i+1}, ..., c_j of
consecutive creases with the same direction, either (1) |c_{i-1} - c_i| ≤ |c_i - c_{i+1}| or (2)
|c_{j-1} - c_j| ≥ |c_j - c_{j+1}|. We call this the mingling property because, for maximal
sequences of consecutive creases with the same direction, it says that there are
folds of the opposite direction nearby. In this sense, the mountain-valley pattern
is “crowded” (mountains and valleys must “mingle” together).
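The mingling property can be checked by brute force over all runs of equally directed consecutive creases. The sketch below is only an illustration under assumed conventions: `lengths[k]` is the distance |c_k - c_{k+1}|, `dirs[k-1]` is the direction of crease c_k, and the two comparisons are read as non-strict. It takes quadratic time and makes no attempt at efficiency.

```python
def is_mingling(lengths, dirs):
    """Check the mingling property for every run c_i, ..., c_j of equal direction."""
    n = len(dirs)
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            if dirs[j - 1] != dirs[i - 1]:
                break  # the run starting at c_i ends before c_j
            cond1 = lengths[i - 1] <= lengths[i]   # |c_{i-1} - c_i| <= |c_i - c_{i+1}|
            cond2 = lengths[j - 1] >= lengths[j]   # |c_{j-1} - c_j| >= |c_j - c_{j+1}|
            if not (cond1 or cond2):
                return False
    return True
```

With lengths 2, 1, 2 and directions M, M the run (c_1, c_2) violates both conditions, so the pattern is not mingling; with lengths 3, 1, 2 and directions M, V every run satisfies one of them.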

First we show (in the full paper) that mingling mountain-valley patterns
include flat-foldable patterns: 

Lemma 1. Every flat-foldable 1-D mountain-valley pattern is mingling. 

Next we show (again, see the full paper) that having the mingling property 
suffices to imply the existence of a single crimpable pair or foldable end. 

Lemma 2. Any mingling 1-D mountain-valley pattern has either a crimpable 
pair or a foldable end. 

Ideally, we could show at this point that performing either of the two local 
operations preserves the mingling property, and hence a mountain-valley pattern
is mingling precisely if it is flat-foldable. Unfortunately this is false; see the full 
paper. Instead, we must prove that flat foldability is preserved by each of the 
two local operations; i.e., if we treat the folded object from a single crimp as a 
new segment, it is flat-foldable. 

Lemma 3. Folding a foldable end preserves foldability.

Fig. 4. A mingling mountain-valley pattern that when crimped is no longer mingling
and hence not flat-foldable. Indeed, the original mountain-valley pattern is not
flat-foldable.



Lemma 4. Crimping a crimpable pair preserves flat foldability. 

Combining all of the previous results, we have the following: 

Theorem 1. Any flat-foldable 1-D mountain-valley pattern can be folded by a
sequence of crimps and end folds. 

A particularly interesting consequence of this theorem is the following: 

Corollary 1. The following are equivalent for a 1-D mountain-valley pattern
P: 

1. P has a flat folding. 

2. P has a some-layers simple folding. 

3. P has a one-layer simple folding. 

Finally, let us show (in the full paper) that Theorem 1 leads to a simple
linear-time algorithm: 

Theorem 2. The 1-D one-layer and some-layers simple-fold problems can be
solved in O(n) worst-case time on a machine supporting arithmetic on the input
lengths.
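The results of this section suggest a simple greedy procedure: apply any available crimp or end fold, treat the result as a new segment, and repeat. The sketch below is a naive version that rescans the pattern after every operation, so it runs in quadratic time and is not the linear-time algorithm of Theorem 2. It assumes the representation `lengths` / `dirs` used for illustration (hypothetical names), non-strict comparisons, that a crimp of lengths a ≥ b ≤ c leaves a single segment of length a - b + c, and that a folded end is absorbed by its neighboring segment.

```python
def fold_1d(lengths, dirs):
    """Greedy sketch: return a list of operations, or None if the search gets stuck.

    lengths[k] = |c_k - c_{k+1}| and dirs[k-1] = direction of c_k (assumed layout).
    """
    lengths, dirs = list(lengths), list(dirs)
    ops = []
    while dirs:
        n = len(dirs)
        if lengths[0] <= lengths[1]:
            # Fold the left end at c_1; the flap lies within the next segment.
            ops.append(("end", "left"))
            del lengths[0], dirs[0]
            continue
        if lengths[n - 1] >= lengths[n]:
            # Fold the right end at c_n.
            ops.append(("end", "right"))
            del lengths[n], dirs[n - 1]
            continue
        for i in range(1, n):
            if dirs[i - 1] != dirs[i] and lengths[i - 1] >= lengths[i] <= lengths[i + 1]:
                a, b, c = lengths[i - 1 : i + 2]
                lengths[i - 1 : i + 2] = [a - b + c]  # zigzag spans a - b + c
                del dirs[i - 1 : i + 1]               # creases c_i and c_{i+1} are used up
                ops.append(("crimp", i))
                break
        else:
            return None  # neither a crimpable pair nor a foldable end exists
    return ops
```

Crimp indices in the returned list refer to the current, already reduced segment rather than to the original crease numbering.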

4 1-D: All-Layers Simple Folds 

The 1-D all-layers simple-fold problem can be cast as an interesting “string 
folding” problem. (This folding problem is not to be confused with the
well-known protein/string folding problem in biology jSj.) The input mountain-valley
pattern can be thought of as a string of lengths interspersed with mountain and 
valley creases. Specifically, we will assume that the input lengths are specified as 
integers or equivalently rational numbers. (Irrational numbers can be replaced 
by close rational approximations, provided the sorted order of the lengths is 
preserved.) 

Thus, an input string is of the form d_0 c_1 d_1 c_2 ··· c_{n-1} d_{n-1} c_n d_n, where
each c_i ∈ {M, V} and each d_i is a positive rational number. We call each c_i
and d_i a symbol of the string. It will be helpful to introduce some more uniform
notation for symbols. For a string S of length n, we denote the ith symbol by
S[i], where 1 ≤ i ≤ n.

When we make an all-layers simple fold, we cannot “cover up” a crease except 
with a matching crease (which when unfolded is in fact the other direction), 
because otherwise this crease will be impossible to fold later. To formalize this 
condition, we define the complement of symbols in the string: comp(d_i) = d_i,
comp(M) = V, and comp(V) = M. Finally, we call a crease (M or V symbol)
S[i] allowable if S[i - x] = comp(S[i + x]) for all 1 ≤ x ≤ min(i - 1, n - i), except
that S[1] and S[n] (the ends) are allowed to be shorter than their complements.

Lemma 5. A mountain-valley pattern can be folded by a sequence of all-layers 
simple folds precisely if there is an allowable fold, and the result after performing 
that fold has an allowable fold, and so on, until all creases of the segment have 
been folded. 

By Lemma 5, the problem of testing foldability reduces to repeatedly finding
allowable folds in the string. Testing whether a fold at position i is allowable
can clearly be done in O(1 + min{i - 1, n - i}) time, by testing the boundary
conditions and whether S[i - x] = comp(S[i + x]) for 1 ≤ x ≤ min(i - 1, n - i).
Explicitly testing all creases in this manner would yield an O(n^2)-time algorithm
for finding an allowable fold (if one exists). Repeating this O(n) times results in
a naive O(n^3) algorithm for testing foldability.
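The allowability test described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the string is a Python list alternating numeric lengths and the characters "M" / "V", positions are 0-based rather than 1-based, and the function names are hypothetical.

```python
def comp(sym):
    """Complement of a symbol: M and V swap, lengths are unchanged."""
    if sym == "M":
        return "V"
    if sym == "V":
        return "M"
    return sym


def is_allowable(s, p):
    """True if the crease at 0-based position p of string s is allowable."""
    if s[p] not in ("M", "V"):
        return False
    last = len(s) - 1
    for x in range(1, min(p, last - p) + 1):
        left, right = s[p - x], s[p + x]
        if left == comp(right):
            continue
        # A mismatch is tolerated only when an end of the segment is the
        # shorter of the two lengths being compared.
        left_end_shorter = (p - x == 0) and left < right
        right_end_shorter = (p + x == last) and right < left
        if not (left_end_shorter or right_end_shorter):
            return False
    return True


def allowable_folds(s):
    """Scan every position; this is the quadratic-time search for one fold."""
    return [p for p in range(len(s)) if is_allowable(s, p)]
```

In the string 1 M 2 V 2 M 1 the two outer creases are allowable and the middle one is not, because folding there would place an M on top of an M. In the string 2 M 1 V 2 no crease is allowable.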

This cubic bound can be improved by being a bit more careful. In O(n^2)
time, we can determine for each crease S[i] the largest value of k for which
S[i - x] = comp(S[i + x]) for all 1 ≤ x ≤ k. Using this information it is easy
to test whether a crease S[i] is allowable. After making one of these allowable
folds, we can in O(n) time update the value of k for each crease, and hence
maintain the collection of allowable folds in linear time. This gives an overall
O(n^2) bound, which we now proceed to improve further.

We present two efficient algorithms for folding strings. The algorithm in
Section 4.1 is based on suffix trees and runs in time linear in the bit complexity of
the input. In Section 4.2 we use randomization to obtain a simpler algorithm
that runs in O(n) expected time.

4.1 Suffix-Tree Algorithm

In the full paper we prove the following: 

Theorem 3. A string S of length n can be tested for all-layers simple foldability, 
in time that is dominated by that to construct a suffix tree on S. 

The difficulty with the time bound is that sorting the alphabet seems to be
required. Other than the time to sort the alphabet, it is possible to construct
a suffix tree in O(n) time |^. To sort the alphabet in the comparison model,
O(n log n') time suffices, where n' is the number of distinct input lengths. In
particular, if the input lengths are encoded in binary, then the algorithm is
linear in this bit complexity. On a RAM, the current state-of-the-art algorithm
for integer sorting m uses O(n (log log n)^2) time and linear space.

4.2 Randomized Algorithm

In this section we describe a simple randomized algorithm that solves the 1-D 
all-layers simple-fold problem in O(n) time. There are two parts to the algorithm:

1. assigning labels to the input lengths so that two lengths are equal precisely
if they have the same label; and

2. finding and making allowable folds.

The first part is essentially element uniqueness, and can be solved in linear 
expected time using hashing. For example, the dynamic hashing method de- 
scribed by Motwani and Raghavan m supports insertions and existence queries 
in O(1) expected time. We can use this data structure as follows. For each input
length, check whether it is already in the hash table. If it is not, we assign it a 
new unique identifier, and add it to the hash table. If it is, we use the existing 
unique identifier for that value (stored in the hash table). Let n' denote the 
number of distinct labels found in this process (or 2, whichever is larger). 
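The labeling step can be sketched in a few lines. The version below uses Python's built-in `dict` as the hash table, which is a stand-in for the dynamic hashing scheme cited in the text and not that scheme itself; the function name is hypothetical. Rational lengths could be represented with `fractions.Fraction`, which is hashable.

```python
def label_lengths(lengths):
    """Assign equal labels to equal lengths; return (labels, n_prime)."""
    table = {}   # length value -> unique identifier
    labels = []
    for d in lengths:
        if d not in table:
            table[d] = len(table)   # new unique identifier
        labels.append(table[d])
    n_prime = max(len(table), 2)    # number of distinct labels, or 2 if larger
    return labels, n_prime
```

Two lengths receive the same label precisely if they are equal, so later comparisons can work on small integers instead of the original numbers.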

For the second part, we will show that each performed fold can be found in 
O(1 + r) time, where r is the number of creases removed by the discovered fold
(in other words, the minimum length to an end of the segment to be folded). 
However, it is possible that the algorithm makes a mistake, and that some of the 
reported folds are not actually possible. Fortunately, mistakes can be detected 
quickly, and after O(1) expected iterations the pattern will be folded. (Unless of
course the pattern is not flat-foldable, in which case the algorithm reports this 
fact correctly.) 

In the full paper we give details of the algorithm, and conclude: 

Theorem 4. The 1-D all-layers simple-fold problem can be solved in O(n)
expected time on a machine supporting random numbers and hashing of the input
lengths.

5 Orthogonal Simple Folds in 2-D 

In this section, we generalize our results for 1-D simple folds to orthogonal 2-D 
crease patterns, which consist only of horizontal and vertical folds on a rectan- 
gular piece of paper, where horizontal and vertical are defined by the sides of the 
rectangular paper. In such a pattern, the creases must go all the way through 
the paper, because every vertex of a flat-foldable crease pattern has degree at 
least four PO]. Hence, the crease pattern is a map or grid of creases. Recall 
from Section 3 that the opposite holds in 1-D: one-layer and some-layers folds
are equivalent to general flat-foldability. 

In this section we handle all three cases of simple folds: one-, some-, and 
all-layers folds. To know what time bounds we desire, we must first discuss 
encoding the input. A natural encoding of maps specifies the height of each row 
and the width of each column, thereby using n_1 + n_2 space for an n_1 × n_2 grid.
The mountain-valley assignment, however, requires O(n_1 n_2) space to specify the
direction for each edge of the grid. Hence, our goal of linear time amounts to
being linear in n = n_1 n_2.

It is easy to see that in exactly one of the two orientations, vertical or hori- 
zontal, there must be at least one crease line, all the way across the paper, that is 
entirely valley or mountain. (If there is no such crease, the pattern is unfoldable. 
And it cannot be that there is both a horizontal crease and a vertical crease all 
the way through the paper, since their intersection would be a vertex that is 
locally unfoldable.) Without loss of generality, assume it is horizontal. Let the 
set of these horizontal fold lines be H.

We claim that all fold lines in H must be folded before any other fold. This
is so because (1) folding along any vertical fold line v will lead to a mismatch of
creases at the intersection of v with any unfolded elements of H and (2) horizontal
folds not in H are not entirely mountain or valley and hence cannot be folded
before some vertical fold is made. Thus we have a corresponding 1-D problem
(one-, some- or all-layer folds) to solve with the added necessary condition that the
non-H folds must match up appropriately after all the folds in H are made. (The
time spent checking for this necessary condition can be attributed to the non-H
folds that vanish after every fold.) Since H contains at least one fold, performing
the H folds (strictly) reduces the size of the problem, and we continue. The
base case consists of just horizontal or vertical folds, which corresponds to a 1-D
problem. In summary we have:

Lemma 6. If a crease pattern is foldable, it remains foldable after the folds in
H have been made in any feasible way considering H to be a 1-D problem and
ignoring other creases. 

To find H quickly we maintain the number of mountain and valley creases
for each row and column of creases. We maintain these numbers as we make
folds in H. To do this we traverse all the creases that will vanish after a fold and
decrement the corresponding numbers. The cost of this traversal is attributed to
the vanishing creases. Every time the number of mountain or valley creases hits
zero in a column or a row, we add the row or column to a list to be used as the
new H in the next step. Thus,

Theorem 5. The problem of deciding simple foldability of an orthogonal crease 
pattern on a rectangular piece of paper can be solved in linear time. 
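The bookkeeping behind the linear-time bound, per-line counts of mountain and valley creases that are decremented as creases vanish, can be sketched as follows. This covers only the counting step, not the full map-folding algorithm. It assumes each full crease line (a row or a column of creases) is given as a sequence of "M" / "V" characters; the function names are hypothetical.

```python
from collections import Counter


def crease_counts(lines):
    """One Counter of 'M' / 'V' creases per full crease line."""
    return [Counter(line) for line in lines]


def is_uniform(count):
    """A line is uniform if it still has creases and all share one direction."""
    return (count["M"] == 0) != (count["V"] == 0)


def uniform_lines(counts):
    """Indices of the lines that are entirely mountain or entirely valley."""
    return [k for k, c in enumerate(counts) if is_uniform(c)]


def remove_crease(counts, k, direction):
    """Record that one crease on line k vanished; report if k is now uniform."""
    counts[k][direction] -= 1
    return is_uniform(counts[k])
```

Each decrement costs constant time, which matches the charging argument in the text: the work is attributed to the crease that vanishes.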

6 Hardness of Simple Folds in 2-D 

In this section we prove that the problem of deciding whether a 2-D axis-parallel 
mountain-valley pattern can be simply folded is (weakly) NP-hard, if we allow 
the initial paper to be an arbitrary orthogonal polygon. We also show that it is 
(weakly) NP-hard to decide whether a mountain- valley pattern on a square piece 
of paper can be folded by some-layers simple folds, if the creases are allowed to 
be axis-parallel plus at a 45-degree angle.

Both hardness proofs are based on a reduction from an instance of partition:
given a set X of n integers a_1, a_2, ..., a_n whose sum is A, does there exist a set
S ⊆ X such that Σ_{a ∈ S} a = A/2? For convenience we define the set S̄ = X \ S.
Also, without loss of generality, we assume that a_1 ∈ S.
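To make the source problem concrete, and to illustrate why hardness results obtained from it are only "weak", here is the standard pseudo-polynomial dynamic program for partition. It is a sketch that assumes positive integer inputs; its running time is polynomial in n and the sum A, not in the number of bits needed to write A. The function name is hypothetical.

```python
def partition_subset(values):
    """Return indices of a subset summing to half the total, or None."""
    total = sum(values)
    if total % 2:
        return None
    target = total // 2
    reach = {0: []}  # reachable sum -> indices of one subset achieving it
    for idx, a in enumerate(values):
        # Iterate over a snapshot so that each value is used at most once.
        for s, subset in list(reach.items()):
            t = s + a
            if t <= target and t not in reach:
                reach[t] = subset + [idx]
    return reach.get(target)
```

A returned index list plays the role of the set S in the reduction; the remaining indices form its complement.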

The PARTITION problem is known to be (weakly) NP-hard (3- We transform 
an instance of the partition problem into an orthogonal 2-D crease pattern on
an orthogonal polygon, as shown in Figure 5.




Fig. 5. Top: Hardness reduction from the partition problem. Bottom: Semi-folded
staircase confined between the y coordinates of P_1 and P_2. The top side of the
paper is drawn white and the other side is drawn gray.



In the figure, all creases are valleys. There is a staircase of width ε
corresponding to a_1, ..., a_n, where 0 < ε < 2/3. There is a step in the staircase of
length a_i corresponding to each element in X. L is a constant greater than
A/2. Also W_2 > W_1. Also let there be a coordinate system with horizontal x-axis
and vertical y-axis.

Lemma 7. If the partition instance has a solution, then the crease pattern in 
Figure 5 is simply foldable.



Lemma 8. If the crease pattern in Figure 5 is simply foldable, there exists a
solution to the partition instance. 

Lemmas 7 and 8 imply the following theorem.

Theorem 6. The problem of deciding simple foldability of an orthogonal paper
with an orthogonal crease pattern is (weakly) NP-complete.

In the full paper, we prove the following theorem, which shows that even on a 
rectangular piece of paper it is hard to decide foldability if, besides axis-parallel, 
there are creases in diagonal directions (45 degrees with respect to the axes): 

Theorem 7. It is (weakly) NP-complete to decide the foldability of an (axis- 
parallel) square sheet of paper with a crease pattern having axis-parallel creases 
and creases at the diagonal angles of 45 degrees with respect to the axes, for both 
all-layers and some-layers simple folds. 

The problem is open for the one-layer case. 

7 No Mountain-Valley Assignments

An interesting case to consider is when no crease has a mountain-valley
assignment: any crease can be folded in either direction. Even with this
flexibility, we are able to show that the problem is hard (see the full paper for the
proof): 

Theorem 8. The problem of deciding the foldability of an orthogonal paper with
a crease pattern that does not have mountain-valley assignment is (weakly) NP- 
complete, for both the all-layers and some-layers cases. 

The problem is open for the one-layer case.

Abstract. We introduce the relaxed k-tree, a search tree with relaxed
balance and a height bound, when in balance, of (1 + ε) log_2 n + 1, for
any ε > 0. The rebalancing work is amortized O(1/ε) per update. This is
the first binary search tree with relaxed balance having a height bound
better than c · log_2 n for a fixed constant c. In all previous proposals, the
constant is at least 1/log_2 φ > 1.44, where φ is the golden ratio.

As a consequence, we can also define a standard (non-relaxed) k-tree with
amortized constant rebalancing per update, which is an improvement 
over the original definition. 

Search engines based on main-memory databases with strongly fluctuat- 
ing workloads are possible applications for this line of work. 



1 Introduction 

The k-trees [7] differ from other binary search trees in that the height can be
maintained arbitrarily close to the optimal ⌈log_2 n⌉ while the number of
rebalancing operations carried out in response to an update remains O(log n). The
price to be paid is that the size of each rebalancing operation (the number of
nodes which must be inspected) grows as we approach ⌈log_2 n⌉. More precisely,
one can show that to obtain height less than (1 + ε) log_2 n + 1, rebalancing
operations have size O(1/ε) in [7].

Thus, using k-trees, a trade-off between search time and rebalancing time
becomes an option; the more interesting direction being the scenario where up- 
dates are infrequent compared with searches. Under such circumstances, it may 
be beneficial to spend more time on the occasional updates in order to obtain 
shorter search paths. 

Search engines pose a particular search/update problem. Searching is domi- 
nant, but when updates are made, they often come in bursts because keywords 
originate from large sites. Equipping search trees with relaxed balance has been 
proposed as a way of being able to adapt smoothly to this ever-changing scenario.

Relaxed balance is the term used for search trees where updating and
rebalancing have been uncoupled, most often through a generalization of the basic
structure. The uncoupling is achieved by allowing rebalancing after different 
updates to be postponed, broken down into small steps, and interleaved. 

The challenge for the designers of these structures is to ensure and be able 
to prove efficient rebalancing in this more general structure. If that problem is 
overcome, then the benefit is the extra flexibility. During periods with heavy 
updating, rebalancing can be decreased or even turned completely off to allow 
a higher throughput of updates as well as searches. When the update burst is 
over, the structure can gradually be rebalanced again. Since a search engine is in 
constant use, it is important that this rebalancing is also carried out efficiently, 
i.e., using as few rebalancing operations as possible. 

Besides search engines, the flexibility provided by relaxed balance may be 
an attractive option for any database application with strongly fluctuating
workloads.

Relaxed balance has been studied in the context of AVL-trees P starting 
in PH] and with complexities matching the ones from the standard case in jn|. 
The height bound for AVL-trees in balance is (1/log_2 φ) log_2 n > 1.44 log_2 n. In the
context of red-black trees 0, relaxed balance has been studied starting in jSl 
0] with results gradually matching the standard case mi in PEEl- The height 
bound for red-black trees in balance is 2 log_2 n. A more thorough introduction to
relaxed balance as well as a comprehensive list of references can be found in p| . 

In this paper, we first develop an alternative definition of standard k-trees.
The purpose of this is both to cut down on the number of special cases, and 
to pave the way for an improved complexity result. Based on this, a relaxed 
proposal is given, and complexity results are shown. The complexity results are 
in the form of upper bounds on the number of rebalancing operations which 
must be carried out in response to an update. 

It is worth noting that the alternative definition of k-trees, which is the
starting point for the relaxed definition, also gives rise to an improved complexity 
result for the standard case: in addition to the logarithmic worst-case bound, 
rebalancing can now be shown to use amortized constant time as well. 



2 k-Trees

The k-trees of [7] are search trees where all external nodes have the same depth
and all internal nodes are either unary or binary. The trees are leaf-oriented, 
meaning that the external nodes contain the keys of the elements stored, and 
the internal nodes contain routers, which are keys guiding the search. Binary 
internal nodes contain one key, unary internal nodes contain no keys. To avoid 
arbitrarily deep trees, restrictions are imposed on the number of unary nodes: 
on any level of the tree, the first k nodes to the right of a unary node should (if 
present) be binary. As this does not preclude a string of unary nodes starting at 
the root, it is also a requirement in [7] that the rightmost node on each level is
binary.

It is intuitively clear that a larger value of k gives a lower density of unary
nodes, which implies a smaller height for a given number n of stored keys. 
The price to be paid is an increased amount of rebalancing work per update — 
more precisely, one can show that with the proposal of j7] a height bound of 
(1 + 0(1/A;)) log 2 n + 1 can be maintained with O(fclogn) work per update. 
Thus, fc-trees offer a tradeoff between the height bound and the rebalancing 
time, and furthermore allow for height bounds of the form c • log 2 n + 1, where 
the constant c can be arbitrarily close to one. 

While the possibility of such a tradeoff is a very interesting property, the
k-trees of [7] also have some disadvantages. One is that, unlike red-black trees,
for example, they do not have an amortized constant bound on the amount of
rebalancing work per update. As a counterexample, consider a series of alternating
deletions and insertions of the smallest key in a complete binary tree of
height h, having $2^h - 1$ binary nodes. Since the tree does not contain any unary
nodes, it is a valid k-tree for any k, and it is easy to verify that the rebalancing
operations described in [7] will propagate all the way to the root after each update.
Another disadvantage is the lack of left-right symmetry in the definition of
k-trees in [7], forcing operations at the rightmost path in the tree to be special
cases. This approximately doubles the number of operations compared with the
number of essentially different operations.

We therefore propose an alternative definition of k-trees. This definition will
allow us to add relaxed balance using a relatively simple set of rebalancing
operations, for which we can prove an amortized complexity of $O(1)$ per update.
Additionally, this enables us to define a new non-relaxed k-tree with the same
complexity, simply by deciding to rebalance completely after each update. This
is an improvement of the result from [7].

Our basic change is in the way the density of unary nodes is kept low. On 
each level in the tree, except the topmost, we divide the nodes into groups of 
$\Theta(k)$ neighboring nodes. Thus, a group is simply a contiguous segment of a given
level. The groups are implemented by marking the leftmost node in each group,
using one bit. Furthermore, in each group, we allow two unary nodes, contrary
to the original proposal [7] which considers unary nodes one by one. Intuitively,
this is what gives the amortized constant rebalancing per update. The top of the 
tree is managed differently, as the levels are too small to contain a group. 

Definition 1. For any integer $k \ge 2$, a symmetric k-tree is a tree containing
unary and binary nodes, where all external nodes have the same depth. The
topmost $1 + \lceil \log k \rceil$ levels consist of binary nodes only. In level number
$2 + \lceil \log k \rceil$ from the top, there is at least one binary node. On the rest of the
levels in the tree, the internal nodes are divided into groups of neighboring nodes.
Each group contains at least 2k nodes and at most 4k nodes. In each group, at most
two of the nodes are unary.

We call level number $2 + \lceil \log k \rceil$ the buffer level. For the number S of nodes
in the buffer level, we have $2k \le S < 4k$.

The tree is turned into a search tree by storing elements in the external nodes
and routers in the binary internal nodes in accordance with the usual in-order
ordering. Searching proceeds as in any binary search tree, except that unary
nodes (which contain no keys) are just passed through.
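
The search rule can be illustrated with a minimal Python sketch. The class names `Leaf`, `Unary` and `Binary`, and the convention that keys less than or equal to a router go left, are assumptions made for this illustration; the text does not fix them.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    key: int                 # external node holding one stored key


@dataclass
class Unary:
    child: "Node"            # unary internal node: no router, one child


@dataclass
class Binary:
    router: int              # key guiding the search
    left: "Node"
    right: "Node"


Node = Union[Leaf, Unary, Binary]


def search(node: Node, key: int) -> bool:
    """Leaf-oriented search; unary nodes are simply passed through."""
    while not isinstance(node, Leaf):
        if isinstance(node, Unary):
            node = node.child
        elif key <= node.router:     # assumed convention: <= router goes left
            node = node.left
        else:
            node = node.right
    return node.key == key
```

For example, in `Binary(5, Binary(2, Leaf(2), Leaf(5)), Unary(Leaf(8)))` all external nodes have depth two, and a search for 8 passes through the unary node on its way down.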

To add relaxed balance, we must allow insertions and deletions to proceed
without immediate rebalancing, and therefore must relax the structural constraints
of Definition 1. To achieve this, we allow nodes of degree larger than
two, and allow an arbitrary number of unary nodes in a group. To keep the
actual trees binary, we use the standard method [4] of representing i-ary nodes
by binary subtrees with $i - 1$ nodes, indicating the root of each subtree by one
bit of information, traditionally termed a red/black color.

We define the black depth of a node as the number of black nodes (including 
itself, if black) on the path to the root. The black level number i consists of all 
black nodes having the black depth i. Note that for any node, the number of 
black nodes below it on a path to an external node is the same for all such paths. 
We call this number the black height of the node. 

Definition 2. For any integer $k \ge 2$, a relaxed k-tree is a tree containing unary
and binary nodes, where nodes are colored either black or red. The root, the unary
nodes and the external nodes are always black. All external nodes in the tree have
the same black depth. In the topmost $1 + \lceil \log k \rceil$ black levels there are no unary
nodes, and no node has a red child. In black level number $2 + \lceil \log k \rceil$, there is
at least one binary node. On the rest of the black levels in the tree, the internal
nodes are divided into groups of neighboring nodes, with each group containing
at least 2k nodes and at most 4k nodes.

A relaxed k-tree is a standard (symmetric) k-tree if all nodes are black, and
no group contains more than two unary nodes. It turns out that in our relaxed
search trees, we also need to allow empty external nodes, i.e., external nodes
with no elements. Later in this paper, we give a set of rebalancing operations
which can turn a relaxed k-tree with empty external nodes into a standard k-tree
without empty external nodes, and we give bounds on the number of operations
needed for this.

3 Height Bound 

By the height of a tree we mean the maximal number of edges on any path from 
the root to an external node. We now show that the height of symmetric k-trees
is just as good as that of the original version in [7].

Theorem 1. The height of a symmetric k-tree with n external nodes is bounded
by $\log_\alpha n + 1$, where $\alpha = 2 - 1/k$.

Proof. On any level, except the buffer level, at most two out of each 2k nodes
are unary. It follows that the number of nodes for each new level, except the
buffer level, increases at least by a factor of $2(1 - 1/k) + 1/k = 2 - 1/k = \alpha$. For
the buffer level, the number of nodes does not decrease. Hence, a tree of height
h contains at least $\alpha^{h-1}$ external nodes. □

Using the identity $\log_\alpha(x) = \log_2(x)/\log_2(\alpha)$, this height bound may be
stated in a more standard way as $c \cdot \log_2 n + 1$. Examples of values of c attainable
by varying k are shown in Table 1.



Table 1. Corresponding values of k and c. 



k    2     3     4     5     6     7     8     9     10    20     50     100
c    1.71  1.36  1.24  1.18  1.14  1.12  1.10  1.09  1.08  1.038  1.015  1.007
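
The entries of Table 1 follow from Theorem 1 via $c = 1/\log_2(2 - 1/k)$. The short Python sketch below recomputes them; the function name `height_constant` is ours and is not part of the paper.

```python
import math


def height_constant(k: int) -> float:
    """Return c with log_alpha(n) = c * log2(n), where alpha = 2 - 1/k."""
    if k < 2:
        raise ValueError("symmetric k-trees are defined for k >= 2")
    alpha = 2 - 1 / k
    return 1 / math.log2(alpha)


if __name__ == "__main__":
    # Print the constant for the values of k listed in Table 1.
    for k in (2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, 100):
        print(k, round(height_constant(k), 3))
```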



The asymptotic relationship between k and c is as follows: 

Corollary 1. In symmetric k-trees, the height is bounded by

$(1 + \Theta(1/k)) \log_2 n + 1$.

Proof. This follows from Theorem 1 by the identity $\log_\alpha(x) = \log_2(x)/\log_2(\alpha)$
and the first order approximations

$\log_2(1 + \varepsilon) = 0 + \varepsilon/\ln 2 + O(\varepsilon^2)$,

$1/(1 - \varepsilon) = 1 + \varepsilon + O(\varepsilon^2)$.

□
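
To see where the $\Theta(1/k)$ term comes from, the two approximations can be combined explicitly. The following is our own spelled-out version of that step, applying the first approximation with $\varepsilon = -1/(2k)$:

```latex
\begin{align*}
\log_2 \alpha &= \log_2\!\left(2 - \tfrac{1}{k}\right)
               = 1 + \log_2\!\left(1 - \tfrac{1}{2k}\right)
               = 1 - \frac{1}{2k \ln 2} + O\!\left(\tfrac{1}{k^2}\right), \\
c = \frac{1}{\log_2 \alpha}
              &= 1 + \frac{1}{2k \ln 2} + O\!\left(\tfrac{1}{k^2}\right)
               = 1 + \Theta\!\left(\tfrac{1}{k}\right).
\end{align*}
```

For k = 100 the leading terms give $c \approx 1 + 1/(200 \ln 2) \approx 1.007$, in agreement with the last entry of Table 1.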



4 Operations 

As mentioned above, a search operation proceeds as in any binary search tree, 
except that unary nodes are just passed through. 

An insert operation starts by a search which ends in an external node v. If
v is empty, the new element is placed there. Otherwise, a new external node
containing the new element is made. In that case, if the parent of v is unary, it
is made binary, and the new external node becomes a child of the binary node.
The key of the new element is inserted as router in the binary node. If the parent
of v is binary, a new red binary node is inserted below it, having v and the new
external node as children and the key of the new element as router.







Fig. 1. The insert operation. 



A delete operation first searches for the external node v, containing the element
to be deleted. If the parent of v is unary, the leaf becomes empty. Otherwise,
v is removed, and the binary parent is made unary (discarding the router in it),
in case it is black, and is removed completely, in case it is red.

Fig. 2. The delete operation. 
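
The structural effect of the two update operations can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the names `insert_at` and `delete_at`, and the parent pointers are ours, and the router stored in a new binary node is the smaller of the two keys (so that "less than or equal goes left" stays valid), whereas the text simply stores the key of the new element. Groups and rebalancing are ignored.

```python
class Node:
    def __init__(self, kind, key=None, red=False):
        self.kind = kind        # "leaf", "unary" or "binary"
        self.key = key          # element (leaf) or router (binary); None if empty
        self.red = red
        self.parent = None
        self.children = []

    def attach(self, *kids):
        """Set the children of this node and fix their parent pointers."""
        self.children = list(kids)
        for child in kids:
            child.parent = self
        return self


def insert_at(v, x):
    """Insert element x at external node v, where the search for x ended."""
    if v.key is None:                       # empty external node: reuse it
        v.key = x
        return
    new = Node("leaf", x)
    left, right = (new, v) if x < v.key else (v, new)
    p = v.parent
    if p.kind == "unary":                   # unary parent is made binary
        p.kind, p.key = "binary", left.key
        p.attach(left, right)
    else:                                   # binary parent: new red node below it
        red = Node("binary", left.key, red=True).attach(left, right)
        p.children[p.children.index(v)] = red
        red.parent = p


def delete_at(v):
    """Delete the element stored in external node v."""
    p = v.parent
    if p.kind == "unary":                   # the leaf just becomes empty
        v.key = None
        return
    sibling = next(c for c in p.children if c is not v)
    if not p.red:                           # black binary parent is made unary
        p.kind, p.key = "unary", None
        p.attach(sibling)
    else:                                   # red binary parent is removed
        g = p.parent
        g.children[g.children.index(p)] = sibling
        sibling.parent = g
```

Neither operation changes the black depth of any external node: an insertion either reuses an existing black node or adds a red one, and a deletion either keeps the black parent as a unary node or removes a red node.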






Fundamental to the rebalancing operations on k-trees is the observation 
that the position of unary nodes may be moved horizontally in the tree by 
shifting subtrees. In Fig. 3, the position of the unary node on the left is moved
six nodes to the right. Letters denote subtrees. 


Fig. 3. The slide operation. 



We call this operation a slide operation, and we use it to move the positions 
of unary nodes horizontally among the black nodes of the same black depth. 

Note that for a slide involving i neighboring nodes, it is necessary to redistribute
some keys to keep the in-ordering of the keys in the tree. The keys in
question are contained in the binary nodes among the nodes involved in the slide,
as well as in the least common ancestors of each consecutive pair of nodes
involved in the slide. This is at most $2i - 1$ keys in total. Excluding the time for
locating these least common ancestors, the slide can be performed in $O(i)$ time.
We address the question of the time for locating these common ancestors later.
In [7], this question is not considered at all.

A relaxed k-tree may contain two kinds of structural problems which keep
it from being a standard (symmetric) k-tree: red binary nodes and groups containing
more than two unary nodes. Additionally, it may contain empty external
nodes. We now describe the set of rebalancing operations which we use to remove
these three problem types.

We only deal with red binary nodes having a black parent. If the parent of 
the red node is unary, we use a contract operation, which merges the node and 
the parent into a black binary node. 

If the parent is binary, we first check if there is a unary node in its group. 
If so, we use the slide operation to make the parent unary, and then perform a 
contract operation. 

If the parent is binary, and there is no unary node in its group, we apply 
the following operation, which makes the parent red and the node itself black.

Fig. 4. The contract operation. 



Furthermore, if the sibling is red, the operation makes it black. Otherwise, the 
operation inserts a unary node above the sibling: 



Fig. 5. The split operation. 



We call this operation a split operation, since it corresponds to the splitting of 
an i-ary node (i > 3) in a formulation with multi-way nodes instead of red/black 
colors. 

For a group containing more than two unary nodes, we use the merge operation
from Fig. 6, which merges two unary siblings into a black binary node. If
the parent is black, it is converted to a unary node. If it is red, it is removed.

Fig. 6. The merge operation. 



Note that by using the slide operation within a group, we can decide freely 
which nodes within the group should be the unary ones. Note also that since 
the group contains at least four nodes (as $k \ge 2$), there will be at least two
neighboring nodes which have parents belonging to the same group on the level 
above. Using the slide operation within that group, we can ensure that if it 
contains any binary node at all, it will be the parent of the two neighboring 
nodes. Thus, only if this group on the level above does not contain any binary 
nodes will we not be able to perform a merge operation on a group containing 
more than two unary nodes. 

We note that split and merge operations will make the number of nodes in
the affected group increase, respectively decrease, by one. To keep the group
sizes within the required bounds, we use a policy similar to that of B-trees,
i.e., a group which reaches a size of $4k + 1$ nodes is split evenly in two, and
a group which reaches a size of $2k - 1$ nodes will either borrow a node from
a neighboring group, or, if this is not possible because no neighboring group
of more than 2k nodes exists, will be merged with a neighboring group. This
entails simply setting, removing, or moving a group border, i.e., a bit in a node.
When borrowing a node from a neighboring group containing fewer than two
unary nodes, we ensure that the borrowed node is binary, by first using a slide
operation, if necessary. The maintenance of group sizes is performed as part of
the split and merge operations.
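
The size policy can be sketched on group sizes alone. In the Python fragment below, one level of the tree is modelled as a list of group sizes; the function name `resize`, the order of the two halves after a split, and the preference for the left neighbor are assumptions of this sketch, and the unary/binary status of individual nodes is not modelled.

```python
def resize(sizes, i, k):
    """Restore 2k <= size <= 4k for group i on one level; returns a new list."""
    sizes = list(sizes)
    size = sizes[i]
    if size == 4 * k + 1:
        # Overfull group: split evenly into groups of 2k and 2k + 1 nodes.
        sizes[i:i + 1] = [2 * k, 2 * k + 1]
    elif size == 2 * k - 1:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(sizes)]
        donor = next((j for j in neighbors if sizes[j] > 2 * k), None)
        if donor is not None:
            # Borrow: move the group border by one node.
            sizes[donor] -= 1
            sizes[i] += 1
        elif neighbors:
            # No neighbor can spare a node: remove the border between them.
            j = neighbors[0]
            low = min(i, j)
            sizes[low:low + 2] = [sizes[i] + sizes[j]]
    return sizes
```

With k = 2, `resize([4, 9, 5], 1, 2)` splits the middle group, `resize([5, 3, 4], 1, 2)` borrows from the left neighbor, and `resize([4, 3, 4], 1, 2)` merges two groups into one of size $4k - 1 = 7$, which is again within bounds.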

Regarding empty external nodes, we note that these will always be children 
of black nodes; they are created that way, and no operation changes this. If the 
parent of the empty external node is binary, we remove the external node and 
make the parent unary. If the parent is unary, but a binary node exists in its
group, we use the slide operation to make the parent binary, and then proceed 
as above. Only if the parent’s group does not contain any binary nodes will we 
not be able to remove the empty external node. 

Fig. 7. The removal of an empty external node. 



For problems immediately below the buffer level, special root operations ap- 
ply. If the problem is a red node, we use the contract operation, and as usual 
use a slide to make the parent of the red node unary, if necessary. However, 
if no unary node exists in the buffer level, this is not possible. In that case, a 
new buffer level consisting entirely of unary nodes is inserted above the previ- 
ous buffer level. We then use the split operation to move the red node past the 
previous buffer level, and then use a contract operation on the new buffer level 
to remove the red node. Note that this maintains the invariant that the buffer 
level should contain at least one binary node. 

Conversely, if a merge operation removes the last binary node of the buffer 
level, we first check if any of the unary nodes in the buffer level has a red child. 
If so, we perform a contract operation on that node. If this is not possible, we 
remove the nodes in the current buffer level (these are all unary), and let the 
black level below be the new buffer level. As the merge operation introduced a 
binary node on this level, the invariant that the buffer level should contain at 
least one binary node is maintained. 

Note that the black height of the tree can only change via a root operation. 

It is clear from inspection that the update and rebalancing operations do not 
violate the invariant that all external nodes have the same black depth. The set 
of rebalancing operations is also complete in the following sense:

Lemma 1. On any relaxed k-tree which is not a standard symmetric k-tree
without empty external nodes, at least one of the rebalancing operations can be 
applied on the tree. 

Proof. A rebalancing operation can always be performed at the topmost red 
node and at the topmost group which contains more than two unary nodes. If 
no group with more than two unary nodes exists, an empty external node can 
always be removed. □ 

5 Complexity Analysis 

The number of rebalancing operations per update is amortized constant: 

Theorem 2. During a sequence of i insertions and d deletions performed on 
an initially empty relaxed k-tree, at most 6i + 4d rebalancing operations can be 
performed before the tree is in balance. 

Proof. The number of removals of empty external nodes clearly cannot exceed
d. To bound the rest of the operations, we define a suitable potential function
on any relaxed k-tree T. Let the unary potential of a group be $|u - 1|$, where
u denotes the number of unary nodes in the group. Denote by $\Phi_1(T)$ the sum
of the unary potential of all groups in T (the buffer level does not constitute
a group, and neither do the levels above it). Denote by $\Phi_2(T)$ the number of
red nodes in T, by $\Phi_3(T)$ the number of groups in T containing 2k nodes, and
by $\Phi_4(T)$ the number of groups in T containing 4k nodes. Define the potential
function $\Phi(T)$ by

$\Phi(T) = 3 \cdot \Phi_1(T) + 6 \cdot \Phi_2(T) + 1 \cdot \Phi_3(T) + 2 \cdot \Phi_4(T)$.

By a lengthy inspection it can be verified that all rebalancing operations,
including any necessary group splitting, merging or sharing, will decrease $\Phi(T)$
by at least one. We analyze one case to give the flavor of the argument, and
leave the rest to the full paper. Consider the case of the split operation depicted
in the middle of Fig. 5. As the group of the top node in the operation does not
contain any unary nodes before the operation, the added unary node after the
operation reduces $\Phi_1(T)$ by one. The number of red nodes does not change, so
neither does $\Phi_2(T)$. The group size increases by one, hence $\Phi_4(T)$ may grow by
one, or the group may have to be split (if the size of the group rises to $4k + 1$).
In the latter case, the sizes of the new groups will be 2k and $2k + 1$, which makes
$\Phi_3(T)$ grow by one while reducing $\Phi_4(T)$ by one, and the new groups will contain
zero and one unary node, respectively, which will make $\Phi_1(T)$ grow by one (for
a total change of zero). By the weights of $\Phi_1(T), \ldots, \Phi_4(T)$ in $\Phi(T)$, this gives a
reduction of $\Phi(T)$ of at least one in all cases.
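
The case analysis above can be checked numerically. The Python sketch below evaluates the potential on a simplified model in which each group is a pair (size, number of unary nodes); this representation and the function name `potential` are our own, and only the group of the top node is modelled.

```python
def potential(groups, red_nodes, k):
    """Potential of a tree given its groups as (size, unary_count) pairs."""
    phi1 = sum(abs(u - 1) for _, u in groups)         # unary potential
    phi2 = red_nodes                                  # number of red nodes
    phi3 = sum(1 for s, _ in groups if s == 2 * k)    # groups of minimum size
    phi4 = sum(1 for s, _ in groups if s == 4 * k)    # groups of maximum size
    return 3 * phi1 + 6 * phi2 + 1 * phi3 + 2 * phi4


k = 2
# Split adds a unary node to a group of size 7 without unary nodes (size becomes 4k).
before_grow = potential([(7, 0)], red_nodes=1, k=k)
after_grow = potential([(8, 1)], red_nodes=1, k=k)
# Split adds a unary node to a group of size 4k, which must then be split in two.
before_split = potential([(8, 0)], red_nodes=1, k=k)
after_split = potential([(4, 0), (5, 1)], red_nodes=1, k=k)
print(before_grow - after_grow, before_split - after_split)
```

In both scenarios the potential drops by exactly one, matching the claim that the split operation decreases the potential by at least one.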

By inspection, it can also be seen that each insert operation increases $\Phi(T)$
by at most six, and that each delete operation either increases $\Phi(T)$ by at most
three, or does not change $\Phi(T)$, but introduces an empty external node, the
removal of which later increases $\Phi(T)$ by at most three. As $\Phi(T)$ is zero for the
empty tree and is never negative, the result follows. □

The proof above may be refined to give the following:

Theorem 3. During a sequence of i insertions and d deletions performed on an
initially empty relaxed k-tree, at most $O((i + d)(6/7)^h)$ rebalancing operations
can be performed at black height h before the tree is in balance.

Proof. The idea of the proof is to define potential functions $\Phi_h(T)$ for
$h = 0, 1, 2, 3, \ldots$, where $\Phi_h(T)$ is defined as $\Phi(T)$ in the proof above, except that it
only counts potential residing at black height h. By inspection, it can be verified
that a rebalancing operation at black height h always decreases $\Phi_h(T)$ by some
amount $\Delta \ge 1$, and that, while it may increase the value of $\Phi_{h+1}(T)$, it will
never do so by more than $6\Delta/7$. As the $\Phi_h(T)$'s are initially zero and are never
negative, this implies the statement in the theorem. The details will appear in
the full paper. □



Theorem 4. If n updates are made on a balanced relaxed k-tree containing N
keys, then at most $O(n \log(N + n))$ rebalancing operations can be made before
the tree is again balanced.

Proof. The problems in a non-balanced tree consist of red nodes and excess 
unary nodes in groups. Assigning to each such problem a height equal to the 
black height of its corresponding node, it can be verified that each rebalancing 
operation which does not reduce the number of problems will increase the height 
of some problem by one, and that no rebalancing operation will decrease the 
height of any problem. 

Problems arise during updates at black height zero, and each update introduces
at most one problem. Thus, if n updates are performed on an initially
balanced tree T, the number of rebalancing operations cannot exceed n times the
maximum black height of T since the start of the sequence of updates.

To bound this maximum height, we recall that the black height of the tree
can only increase during root operations. Specifically, if the black height of the
tree reaches some value h, then there has been a root operation at black height
$h - 1$. It is easily verified that the value of $\Phi(T)$ for a balanced tree T is linear
in the number of keys N in the tree. During the n updates, this value may only
grow by $O(n)$, by the analysis in the proof of Theorem 2. By an argument similar
to that in the proof of Theorem 3, the maximum black height since the start
of the sequence of updates is $O(\log_{7/6}(N + n))$, which proves the theorem. The
details will appear in the full paper. □

6 Comments on Implementation 

In the previous section, we have been concerned with the number of operations 
which have to be carried out, and we have discussed configurations in the tree 
at a fairly abstract level. In order to carry out each operation efficiently, it is 
necessary to be able to find other nodes at the same level, to find least common 
ancestors, and to locate problems in the tree.

First, we note that by maintaining parent pointers and level links between
the black nodes sharing the same black height, all rebalancing operations can be
performed in $O(k)$ time, when not counting the time to find the necessary least
common ancestors during a slide operation. The same is true for a group resizing
operation.

We now discuss how to find the necessary least common ancestor (LCA) of 
a neighboring pair of black nodes participating in a slide operation. 

One approach is heuristic: simply search upwards for each of these LCAs. The
worst case time for this is poor (the search may take time proportional to the
height of the tree for each LCA), but should be good on average in the following
sense: If on some level i in a complete binary tree we consider k neighboring
nodes, then the LCAs of these k nodes will reside in at most two subtrees with
roots at most $\lceil \log k \rceil$ levels above i, except that one of the LCAs may reside
higher (it could be the root of the entire tree). If by $\delta$ we denote the difference
between $i - \lceil \log k \rceil$ and the level of this singular LCA, then it is easily verified that
the expected value of $\delta$ over all possible start positions of the k neighboring nodes
on level i is $O(1)$. As the parts of the two subtrees residing above level i may be
traversed in $O(k)$ time, the time for a randomly placed slide involving k nodes is
expected $O(k)$ in a binary tree. As a k-tree is structurally close to a binary tree
(especially for large k), we therefore believe that the time for finding the LCAs
during a slide is not likely to be a problem, unless during the use of a relaxed
k-tree we allow the tree to become very unbalanced before rebalancing again.

Another approach is to maintain explicit links from every black node to the 
two LCAs between itself and its two black neighbors, allowing each LCA to be 
found in constant time during a slide operation. These links then have to be 
updated for the black nodes on the leftmost and rightmost path on the subtrees 
exchanged between neighboring black nodes during a slide. Assuming that the 
black nodes on such a left- or rightmost path can be accessed in constant time 
per node, this gives a time for a slide involving k neighboring nodes which is 
proportional to k times the black height at which the slide takes place. However,
as $\sum_{h} h (6/7)^h = O(1)$, Theorem 3 implies that the amortized rebalancing
work is still $O(k)$ per update.
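
The constant behind this claim can be made explicit. Assuming, as the paragraph above states, that a slide at black height h costs $O(kh)$, and using the bound of Theorem 3 on the number of operations at each height, the standard identity $\sum_{h \ge 0} h x^h = x/(1-x)^2$ for $|x| < 1$ gives the following (our own worked version of the step):

```latex
\begin{align*}
\sum_{h \ge 0} h \left(\tfrac{6}{7}\right)^{h}
  &= \frac{6/7}{(1 - 6/7)^{2}} = \frac{6}{7} \cdot 49 = 42 = O(1), \\
\sum_{h \ge 0} O\!\left((i + d)\left(\tfrac{6}{7}\right)^{h}\right) \cdot O(k h)
  &= O\big(k\,(i + d)\big).
\end{align*}
```

The total slide work over i insertions and d deletions is therefore $O(k(i + d))$, that is, $O(k)$ amortized per update.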

So, we must be able to traverse only the black nodes on the left- and rightmost 
paths mentioned above. For every black node, we keep all its immediate red 
descendants (those forming a single node in a formulation with multi-way nodes 
instead of red/black colors) in a doubly linked list. The list is ordered (the list 
may be seen as adding in-order links to all red connected components of the tree), 
and the front and rear of the list are pointed to from the black node rooting the red
connected component. Using these front and rear pointers, it is now possible to
jump from black to black level during a traversal of right- and leftmost paths, as
assumed above. Furthermore, it can be verified that these lists can be maintained
during update and rebalancing operations, including slides.

Finally, locating problems in the tree is complicated by the main feature of 
relaxed balance, namely that the rebalancing is uncoupled from the updating. 
Hence, an update operation simply ignores any problem which arises as a consequence
of the update. To be able to return to these problems later, a problem
queue can be maintained. For problems discovered during an update, a pointer
is stored in the queue. One pointer per group suffices. Since update and rebalancing
operations may remove other problems, it is convenient to be able to
remove problems from the queue. Thus, each group must have a back-pointer
into the problem queue. Rebalancing operations start by dequeuing a pointer,
which is then followed, and the appropriate rebalancing operation is performed.
Rebalancing operations should also insert new pointers when they move problems
upwards in the tree, unless the receiving group is already in the queue.
Note that when a problem is added to the tree, it can be verified in $O(k)$ time
whether the affected group has a problem already.
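
One possible realization of such a problem queue is sketched below in Python. The class name `ProblemQueue` and the use of hashable group identifiers are assumptions of this sketch; the ordered dictionary stands in for the back-pointers, since it gives constant expected time removal of an arbitrary queued group.

```python
from collections import OrderedDict


class ProblemQueue:
    """FIFO queue of groups with pending problems, at most one entry per group."""

    def __init__(self):
        self._entries = OrderedDict()    # group id -> True, in arrival order

    def enqueue(self, group_id):
        """Record a problem in `group_id` unless the group is already queued."""
        if group_id not in self._entries:
            self._entries[group_id] = True

    def remove(self, group_id):
        """Drop a group whose problems were removed by some other operation."""
        self._entries.pop(group_id, None)

    def dequeue(self):
        """Return the oldest queued group, or None if nothing is pending."""
        if not self._entries:
            return None
        group_id, _ = self._entries.popitem(last=False)
        return group_id
```

A rebalancing process would repeatedly call `dequeue`, handle the problems of the returned group, and `enqueue` the receiving group whenever a problem is moved upwards.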

Acknowledgment. The first and third author were partially supported by the 
IST Programme of the EU under contract number IST-1999-14186 (Alcom-
FT). The third author was partially supported by the Danish Natural Science 
Research Council (SNF). 

References 

1. G. M. Adel'son-Vel'skii and E. M. Landis. An Algorithm for the Organisation
of Information. Doklady Akademii Nauk SSSR, 146:263-266, 1962. In Russian.
English translation in Soviet Math. Doklady, 3:1259-1263, 1962.

2. Joan Boyar, Rolf Fagerberg, and Kim S. Larsen. Amortization Results for Chromatic
Search Trees, with an Application to Priority Queues. Journal of Computer
and System Sciences, 55(3):504-521, 1997.

3. Joan F. Boyar and Kim S. Larsen. Efficient Rebalancing of Chromatic Search 
Trees. Journal of Computer and System Sciences, 49(3):667-682, 1994. 

4. Leo J. Guibas and Robert Sedgewick. A Dichromatic Framework for Balanced
Trees. In Proceedings of the 19th Annual IEEE Symposium on the Foundations of
Computer Science, pages 8-21, 1978.

5. Kim S. Larsen. Amortized Constant Relaxed Rebalancing using Standard Rota- 
tions. Acta Informatica, 35(10):859-874, 1998. 

6. Kim S. Larsen. AVL Trees with Relaxed Balance. Journal of Computer and System 
Sciences, 61(3):508-522, 2000. 

7. H. A. Maurer, Th. Ottmann, and H.-W. Six. Implementing Dictionaries using 
Binary Trees of Very Small Height. Information Processing Letters, 5(1):11-14, 
1976. 

8. Otto Nurmi and Eljas Soisalon-Soininen. Uncoupling Updating and Rebalancing 
in Chromatic Binary Search Trees. In Proceedings of the Tenth ACM SIGACT- 
SIGMOD-SIGART Symposium on Principles of Database Systems, pages 192-198, 
1991. 

9. Otto Nurmi and Eljas Soisalon-Soininen. Chromatic Binary Search Trees — A 
Structure for Concurrent Rebalancing. Acta Informatica, 33(6):547-557, 1996. 

10. Otto Nurmi, Eljas Soisalon-Soininen, and Derick Wood. Relaxed AVL Trees, Main- 
Memory Databases and Concurrency. International Journal of Computer Mathe- 
matics, 62:23-44, 1996. 

11. Neil Sarnak and Robert E. Tarjan. Planar Point Location Using Persistent Search 
Trees. Communications of the ACM, 29:669-679, 1986. 




Succinct Dynamic Data Structures* 



Rajeev Raman¹, Venkatesh Raman², and S. Srinivasa Rao²



¹ Department of Mathematics and Computer Science
University of Leicester, Leicester LE1 7RH, UK.
r.raman@mcs.le.ac.uk

² Institute of Mathematical Sciences, Chennai, India 600 113,
{vraman, ssrao}@imsc.ernet.in



Abstract. We develop succinct data structures to represent (i) a sequence
of values to support partial sum and select queries and update
(changing values) and (ii) a dynamic array consisting of a sequence of
elements which supports insertion, deletion and access of an element at
any given index.

For the partial sums problem on n non-negative integers of k bits each,
we support update operations in $O(b)$ time and sum in $O(\log_b n)$ time,
for any parameter b, $\lg n/\lg\lg n \le b \le n^{\varepsilon}$ for any fixed positive $\varepsilon < 1$.
The space used is $kn + o(kn)$ bits and the time bounds are optimal. When
$b = \lg n/\lg\lg n$ or $k = 1$ (i.e., when we are dealing with a bit-vector),
we can also support the select operation in the same time as the sum
operation, but the update time becomes amortised.

For the dynamic array problem, we give two structures both using $o(n)$
bits of extra space where n is the number of elements in the array:
one supports lookup in constant worst case time and updates in $O(n^{\varepsilon})$
worst case time, and the other supports all operations in $O(\lg n/\lg\lg n)$
amortized time. The time bounds of both these structures are optimal.



1 Introduction 

Recently there has been a surge of interest in the study of succinct data structures.
The aim is to design data structures that are asymptotically
optimal with respect to operation times, but whose space usage is optimal
to within lower-order additive terms. Barring a few exceptions [2, 13], most
of these are static structures. In this paper we look at succinct solutions to
two classical interrelated dynamic data structuring problems, namely maintaining
partial sums and dynamic arrays. We assume a RAM model with word size
$O(\lg n)$ bits, where n is the input size. In this model, reading and writing $O(\lg n)$
consecutively stored bits, arithmetic and bit-wise boolean operations on $O(\lg n)$-bit
operands can be performed in constant time. In more detail, the problems
considered are:

* Research supported in part by UK-India Science and Technology Research Fund 
project number 2001.04/IT and in part by UK EPSRC grant GR L/92150.

Partial Sums. This problem has two positive integer parameters, the item
size $k = O(\lg n)$, and the maximum increment $\delta_{max} = n$. The problem
consists in maintaining a sequence of n numbers $A[1], \ldots, A[n]$ such that
$0 \le A[i] \le 2^k - 1$ under the operations:

- sum(i): return the value $\sum_{j=1}^{i} A[j]$.

- update(i, δ): set $A[i] \leftarrow A[i] + \delta$, for some integer δ such that
  $0 \le A[i] + \delta \le 2^k - 1$ and $|\delta| \le \delta_{max}$.

We also consider adding the following operation:

- select(j): find the smallest i such that sum(i) ≥ j.

In what follows, we refer to the partial sums problem with select as the searchable
partial sums problem.
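
To pin down the semantics of the three operations, the following Python sketch implements them with a standard Fenwick (binary indexed) tree. This is not the succinct structure developed in this paper: it uses a linear number of machine words and takes $O(\lg n)$ time per operation. The class name `PartialSums` is ours, and the upper bounds $2^k - 1$ and $\delta_{max}$ are not enforced; only non-negativity is checked.

```python
class PartialSums:
    """Searchable partial sums over A[1..n] using a Fenwick tree (1-based)."""

    def __init__(self, values):
        self.n = len(values)
        self.a = [0] + list(values)          # 1-based copy of A
        self.tree = [0] * (self.n + 1)
        for i, v in enumerate(values, start=1):
            self._add(i, v)

    def _add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def sum(self, i):
        """Return A[1] + ... + A[i]."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def update(self, i, delta):
        """Set A[i] <- A[i] + delta, keeping A[i] non-negative."""
        if self.a[i] + delta < 0:
            raise ValueError("A[i] must stay non-negative")
        self.a[i] += delta
        self._add(i, delta)

    def select(self, j):
        """Return the smallest i with sum(i) >= j, or None if there is none."""
        pos, remaining = 0, j
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] < remaining:
                pos = nxt
                remaining -= self.tree[nxt]
            step >>= 1
        return pos + 1 if pos < self.n else None
```

The descent in `select` relies on all values being non-negative, which is exactly the condition $0 \le A[i]$ imposed in the problem statement.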

Dietz P] has given a structure for the partial sums problem that supports 
sum and update in $O(\lg n/\lg\lg n)$ worst-case time using $O(n \lg n)$ bits of extra
space, for the case $k = \Theta(\lg n)$. As the information-theoretic space lower bound
is kn bits, Dietz's data structure uses a constant factor extra space even when
$k = \Theta(\lg n)$, and is worse for smaller k. We modify Dietz's structure to obtain a
data structure that solves the searchable partial sums problem in $O(\lg n/\lg\lg n)$
worst case time using $kn + o(kn)$ bits of space. Thus, we improve the space
utilisation and add the select operation as well.

For the partial sums problem we can trade off query and update times as
follows: for any parameter $b \ge \lg n/\lg\lg n$ we can support sum in $O(\log_b n)$ time
and update in $O(b)$ time¹. The space used is the minimum possible to within a
lower-order term.

Our time bounds are optimal in the following sense. Fredman and Saks j^] 
gave lower bounds for this problem in the cell probe model with logarithmic word 
size, a much stronger model than ours. For the partial sums problem, they show 
that an intermixed sequence of n updates and queries requires $\Omega(\lg n/\lg\lg n)$
amortized time per operation. Furthermore, they give a more general trade-off jHl 
Proof of Thm 3'] between the number of memory locations that must be written 
and read by an intermixed sequence of updates and queries. Our data structure 
achieves the optimal trade-off between reads and writes, for the above range of 
parameter values. If we require that queries be performed using read-only access 
to the data structure — a requirement satisfied by our query algorithms — then 
the query and update times we achieve are also optimal. 

Next, we consider a special case of the searchable partial sums problem that 
is of particular interest. 



Dynamic Bit Vector. Given a bit vector of length n, support the following
operations:

- rank(i): find the number of 1's occurring before and including the ith bit,
- select(j): find the position of the jth one in the bit vector, and
- flip(i): flip the bit at position i in the bit vector.

Footnote 1: In the partial sum and bit-vector results, other trade-offs that allow
expensive queries and cheap updates are possible. These are mentioned in the
appropriate sections.
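To pin down the semantics of the three operations, here is a minimal reference sketch in Python. It is deliberately naive: every operation takes O(n) time and nothing about it is succinct. The class name NaiveBitVector is an illustrative assumption rather than anything from the paper, and positions are taken to be 1-based.

```python
class NaiveBitVector:
    """Reference semantics only: every operation here is O(n), not succinct."""

    def __init__(self, bits):
        self.bits = list(bits)  # stored 0-based; the methods take 1-based positions

    def rank(self, i):
        # Number of 1's among positions 1..i (inclusive).
        return sum(self.bits[:i])

    def select(self, j):
        # Position of the j-th 1, or None if there are fewer than j ones.
        seen = 0
        for pos, bit in enumerate(self.bits, start=1):
            seen += bit
            if bit and seen == j:
                return pos
        return None

    def flip(self, i):
        # Toggle the bit at position i.
        self.bits[i - 1] ^= 1
```

A useful sanity property is that rank(select(j)) equals j whenever the jth one exists; any faster structure should agree with this sketch on every operation sequence.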

A bit vector supporting rank and select is a fundamental building block for
succinct static tree and set representations [?]. Given a (static) bit vector, we
can support the rank and select operations in O(1) time using o(n) bits of extra
space [?].

As the dynamic bit vector problem is simply the searchable partial sums
problem with k = 1, we immediately obtain a data structure that supports
rank, select and flip operations in O(lg n / lg lg n) worst-case time using o(n)
bits of extra space. For the bit vector, however, we are able to give a trade-off
for all three operations. Namely, for any parameter b ≥ lg n / lg lg n we can
support rank and select in O(log_b n) time and update in amortised O(b) time. In
particular, we can support rank and select in constant time if we allow updates
to take O(n^ε) amortised time for any constant ε > 0.
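The constant-time special case is the choice b = n^ε in this trade-off. As a brief worked step (a restatement of the bound, not an additional result):

```latex
\[
  b = n^{\varepsilon}
  \quad\Longrightarrow\quad
  \log_b n = \frac{\lg n}{\lg b} = \frac{\lg n}{\varepsilon \lg n}
           = \frac{1}{\varepsilon} = O(1),
\]
```

so for constant ε the queries take O(1) time while updates cost O(b) = O(n^ε) amortised time.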

If we remove the select operation from the dynamic bit vector problem, we
obtain the subset rank problem considered by Fredman and Saks [?]. From their
lower bound on the subset rank problem, we conclude that our time bounds are
optimal, in the sense described above.

Next we consider another fundamental problem addressed by Fredman and
Saks [?].

Dynamic Array. Given an initially empty sequence of records, support the
following operations:

- insert(x, i): insert a new record x at position i in the sequence,
- delete(i): delete the record at position i in the sequence, and
- index(i): return the ith record in the sequence.

Dynamic arrays are useful in efficiently implementing data types such as
the Vector class in Java and C++. The dynamic array problem
was called the List Representation problem by Fredman and Saks, who gave a
cell probe lower bound of Ω(lg n / lg lg n) time for this problem, and also showed
that n^Ω(1) update time is needed to support constant-time queries. For this
problem, Goodrich and Kloss [?] obtained a structure that supports insert and
delete operations in O(n^ε) amortized time while supporting the index operation
in O(1/ε) worst-case time. This structure uses O(n^(1-ε)) words of extra space
(besides the space required to store the n elements of the array) for any fixed ε,
0 < ε < 1. Here n is the size of the current sequence.

We first observe that the structure of Goodrich and Kloss can be viewed as a
version of the well-known implicit data structure, the 'rotated list' [?]. Using this
connection, we observe that the structure of Goodrich and Kloss can be made
to take O(n^ε) worst-case time for updates while maintaining the same storage
(o(n) additional words) and O(1) worst-case time for the index operation. Then,
using this structure in small blocks, we obtain a dynamic array structure that
supports insert, delete and index operations in O(lg n / lg lg n) amortized time
using o(n) bits of extra space. Due to the lower bound result of Fredman and
Saks, both our results above are on optimal points of the query time-update time
trade-off while using an optimal (to within a lower-order term) amount of extra space.

We should also point out that the resizable arrays of Brodnik et al. [?] can be
used to support inserting and deleting elements at either end of an array and
accessing the ith element in the array, all in constant time. The data structure
uses O(√n) words of extra space if n is the current size of the array. Brodnik et
al. do not support insertion into the middle of the array.

For the dynamic array problem, we assume a memory model in which the
system returns a pointer to the beginning of a block of requested size, i.e. any
element in a block of memory can be accessed in constant time given the block
pointer and an integer index into the block. This is the same model used by
Brodnik et al. [?]. We count the time as the number of word operations, and
space as the number of bits used to store the data structure. To simplify notation,
we ignore rounding as it does not affect our asymptotic analysis.

In Section 2 we describe our space-efficient structures for the partial sum
problem. In Section 3 we look at the special case of the partial sum problem
when the given elements are bits, and give the details of a structure that supports
the full trade-off between queries (select and rank) and update (flip). Section 4
addresses the problem of supporting the dynamic array operations. We conclude
with some open problems in Section 5.



2 Partial Sums 

2.1 Searchable Partial Sums 

In this section we describe a structure for the searchable partial sums problem
that supports all operations in O(lg n / lg lg n) time. The space used is kn + o(kn)
bits. We begin by solving the problem on a small set of integers in O(1) time,
by adapting an idea of Dietz [?].

Lemma 1. On a RAM with a word size of w bits, we can solve the searchable
partial sum problem on a sequence of m = w^ε numbers, for any fixed 0 < ε < 1,
with item size k ≤ w, in O(1) worst-case time and using O(mw) bits of space.
The data structure requires a precomputed table of size O(2^(ε'w)) for any fixed
ε' > 0.

Proof. Let A[1], ..., A[m] denote the sequence of elements for which the partial
sums are to be calculated. We store another array B[1], ..., B[m] which contains
the partial sums of A, i.e. B[i] = A[1] + ... + A[i]. As we cannot hope to maintain B
under the update operation, we use Dietz's idea of letting B get slightly 'out of
date'. More precisely, B is not changed after each update; instead, after every m
updates B will be refreshed, or brought up to date. Since the cost of refreshing
is O(m), the amortized cost is O(1) per update.

To answer queries, we maintain an array C[1], ..., C[m] in addition to A and
B. C is set to all zeros when B is refreshed. Otherwise, when an update changes
A[i] by δ, we set C[i] ← C[i] + δ. Since |C[i]| ≤ m·δ_max = w^O(1) always, the entire
array C occupies O(m lg w) bits, which is less than ε'w bits for sufficiently large
w. As observed by Dietz, sum(i) can be computed by adding to B[i] a corrective
term obtained by using C to index into a pre-computed table.
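The interplay between the stale array B and the correction array C can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions, not the word-level structure of the proof: the class name LazyPrefixSums is hypothetical, C is kept as an ordinary list rather than packed into a machine word, and the corrective term is computed by an explicit loop standing in for the single table lookup.

```python
class LazyPrefixSums:
    """B holds possibly stale prefix sums; C holds the changes since the last refresh."""

    def __init__(self, values):
        self.A = list(values)
        self.m = len(self.A)
        self._refresh()

    def _refresh(self):
        # O(m) work, done once every m updates: O(1) amortised per update.
        self.B, running = [], 0
        for a in self.A:
            running += a
            self.B.append(running)
        self.C = [0] * self.m
        self.pending = 0

    def update(self, i, delta):
        # Positions are 1-based. B is left untouched; only A and C change.
        self.A[i - 1] += delta
        self.C[i - 1] += delta
        self.pending += 1
        if self.pending == self.m:
            self._refresh()

    def sum(self, i):
        # Stale prefix sum plus the corrective term C[1] + ... + C[i].
        # (The loop over C replaces the precomputed-table lookup.)
        return self.B[i - 1] + sum(self.C[:i])
```

Between refreshes B may disagree with the true prefix sums, yet sum(i) is always exact because C records every change made since B was last rebuilt.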

We now show how to perform select in O(1) time. For this, we use the Q-heap
structure given by Fredman and Willard [?], which solves the following dynamic
predecessor problem:

Theorem 1. [?] For any 0 < M ≤ 2^w, given a set of at most (lg M)^(1/4) integers
of O(w) bits each, one can support the insert, delete, predecessor
and successor operations in constant time, where predecessor(x) (successor(x))
returns the largest (smallest) element y in the set such that y < x (y > x). The
data structure requires a precomputed table of size O(M).

By choosing M = 2^(ε'w) we can do predecessor queries on sets of size m' =
(lg M)^(1/4) in O(1) time, using a table of size O(M). By using this data structure in
a tree with branching factor m', we can support O(1)-time operations on sets of
size m as well. We store the elements of B in the Q-heap data structure. Note
that changes to the Q-heap caused by refreshing have O(1) amortised cost.

If the array B were up-to-date, then we could answer select queries in O(1)
time by finding the successor of j in the Q-heap. However, again we face the
problem that B may not be up-to-date. To overcome this, we let D[1], ..., D[m]
be an array where D[i] = min{A[i], m·δ_max}. As with C, D is also stored in a
single word, and is changed in O(1) time (using either table lookup or bitwise
operations) whenever A is changed by an update. To implement select(j), we first
consult the Q-heap data structure to determine an index t such that B[t - 1] <
j ≤ B[t]. By calculating sum(t - 1) and sum(t) in O(1) time we determine
whether t is the correct answer. In general, the correct answer would be an index
t' ≠ t; assume for specificity that t' > t. Note that t' is the smallest integer such
that A[t + 1] + ... + A[t'] ≥ j - sum(t). Since j ≤ B[t] and B[t] - sum(t) ≤ m·δ_max,
it follows that j - sum(t) ≤ m·δ_max. By the definition of D, it also follows that
t' is the smallest integer such that D[t + 1] + ... + D[t'] ≥ j - sum(t). Given D,
j - sum(t) and t, one can look up a table to calculate t' in O(1) time. A similar
procedure is followed if t' < t.

Finally, we note that amortization can be eliminated by Dietz's incremental
refreshing approach. □
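The role of the clamped array D in the select argument can be checked on a small example. The sketch below is illustrative only: the function names are hypothetical, select is read as "the smallest index whose partial sum reaches the target" (an assumption about the intended inequality), and the scan is linear, whereas the proof replaces it with one table lookup. The point is that whenever the residual target r is at most the clamping threshold, scanning the clamped values stops at the same index as scanning the originals.

```python
def first_reaching(values, t, r):
    """Smallest t' > t with values[t+1] + ... + values[t'] >= r (1-based), else None."""
    running = 0
    for pos in range(t + 1, len(values) + 1):
        running += values[pos - 1]
        if running >= r:
            return pos
    return None


def clamp(values, cap):
    # D[i] = min(A[i], cap); in the proof, cap plays the role of m * delta_max.
    return [min(v, cap) for v in values]
```

Clamping matters because each D[i] then fits in O(lg w) bits, so the whole array packs into a single word that can index a precomputed table.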

For larger inputs, we choose a parameter m = (lg n)^ε, for some positive
constant ε < 1. We create a complete m-ary tree, the leaves of which correspond
to the entries of the input sequence A. We define the weight of a node (leaf or
internal) as the sum of the sequence elements under it. At each internal node
we store the weights of its children in the data structure of Lemma 1. The
tree has O(n/m) internal nodes, each occupying O(m) words and supporting all
operations in constant time. From this we get:

Lemma 2. There is a data structure for the searchable partial sums problem
that supports all operations in O(lg n / lg lg n) time, and requires O(n) words of space.
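The bounds of Lemma 2 come from the height of the tree and the per-node cost. Assuming the branching factor m = (lg n)^ε used in the construction, a short worked calculation gives:

```latex
\[
  h = \log_m n = \frac{\lg n}{\lg m} = \frac{\lg n}{\varepsilon \lg\lg n}
    = O\!\left(\frac{\lg n}{\lg\lg n}\right),
  \qquad
  O\!\left(\frac{n}{m}\right) \text{ nodes} \times O(m) \text{ words}
    = O(n) \text{ words}.
\]
```

Each operation spends O(1) time at every level, which yields the stated time bound.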

We now modify this structure, reducing its space complexity
to kn + o(kn) bits. For this, we need the following:

Lemma 3. For any parameter b ≥ 4, there is a data structure for the searchable
partial sums problem that supports update in O(log_b n) time and sum and select
in O(b log_b n) time. The space used is kn + O(((k + lg b) · n)/b) bits.

Proof. We construct a complete b-ary tree over the elements of the input
sequence A. At each internal node we store the sum of the elements of A under
it. Clearly update takes time proportional to the height of the tree, and sum
and select can be implemented within the claimed bounds by traversing the tree
from the root to a leaf, looking at all the children of a node at each level. The
space bounds follow after a straightforward calculation. □
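The tree in this proof can be sketched in a few lines of Python. This is a hedged illustration, not the space-efficient layout the lemma needs: the class name BArySumTree is hypothetical, each level is stored as a plain list, values are assumed non-negative, positions are 1-based, and select is taken to return the smallest i whose partial sum reaches j.

```python
class BArySumTree:
    """levels[0] is the input; levels[d + 1][p] is the sum of node p's children."""

    def __init__(self, values, b=4):
        self.b = b  # branching factor, assumed to be at least 2
        self.levels = [list(values)]
        while len(self.levels[-1]) > 1:
            prev = self.levels[-1]
            self.levels.append([sum(prev[p:p + b]) for p in range(0, len(prev), b)])

    def update(self, i, delta):
        # Exactly one node per level changes: O(log_b n) time.
        p = i - 1
        for level in self.levels:
            level[p] += delta
            p //= self.b

    def sum(self, i):
        # p counts the nodes to include at the current level; up to b - 1
        # siblings are added per level, giving O(b log_b n) time.
        p, total = i, 0
        for level in self.levels:
            first = p - p % self.b
            total += sum(level[first:p])
            p //= self.b
        return total

    def select(self, j):
        # Descend from the root, scanning the children of one node per level.
        if j > self.levels[-1][0]:
            return None
        p = 0
        for level in reversed(self.levels[:-1]):
            p *= self.b  # first child of the current node
            while level[p] < j:  # skip children lying wholly before the answer
                j -= level[p]
                p += 1
        return p + 1
```

Here update touches one counter per level while sum and select scan up to b children per level, matching the asymmetry in the lemma.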

We take the input and divide it into groups of (lg n)^2 numbers each.
The groups are represented internally using Lemma 3 with b = (lg n)^(1/2). This
requires kn + o(kn) bits, and all operations within a group take O((lg n)^(1/2)) time,
which is negligible. The n/(lg n)^2 group sums are stored in the data structure
of Lemma 2, which requires o(n) bits now. The precomputed tables (required in
Lemma 1) also require o(n) bits. Thus we have:

Theorem 2. There is a data structure for the searchable partial sums problem
that supports all operations in O(lg n / lg lg n) worst-case time and uses kn + o(kn)
bits of space.

2.2 Trade-Offs for Partial Sums 

We now observe that one can trade off query and update times for the partial
sums problem, and show that for any parameter 2 ≤ b ≤ n, we can support
sum in O(log_b n) and update in O(b log_b n) time, while still ensuring that the
data structure is space-efficient. As these bounds are subsumed by Theorem 2
for b ≤ (lg n)^2, we will assume that b > (lg n)^2 in what follows.

We construct a complete tree with branching factor b, with the given
sequence of n elements at the leaves. Clearly this tree has height h = log_b n. At
each internal node, we store the weight of that node, i.e. the sum of the leaves
descended from it, and also store an array containing the partial sums of the
weights of all its children. By using the obvious O(b) time algorithm, the
partial sum array at an internal node is kept up-to-date after each update. This
gives running times of O(b log_b n) and O(log_b n) for update and sum respectively.
Unfortunately, the space used to store this 'simple' structure is O(kn) bits.
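The 'simple' structure can be sketched as follows. This is an illustrative Python sketch with a hypothetical class name, PrefixSumTree; it keeps, for every node, the running sum of that node and its left siblings, and makes no attempt at the space savings discussed next. Positions are 1-based.

```python
class PrefixSumTree:
    """pre[d][p] = weight of node p at level d plus the weights of its left siblings."""

    def __init__(self, values, b=4):
        self.b = b  # branching factor, assumed to be at least 2
        self.pre = []
        weights = list(values)
        while True:
            level, parents = [], []
            for start in range(0, len(weights), b):
                running = 0
                for w in weights[start:start + b]:
                    running += w
                    level.append(running)
                parents.append(running)  # last running sum = weight of the parent
            self.pre.append(level)
            if len(parents) <= 1:
                break
            weights = parents

    def sum(self, i):
        # One array lookup per level: O(log_b n) time.
        p = i - 1
        total = self.pre[0][p]
        for level in self.pre[1:]:
            p //= self.b
            if p % self.b:
                total += level[p - 1]  # everything under the left siblings
        return total

    def update(self, i, delta):
        # Up to b prefix sums change at each level: O(b log_b n) time.
        p = i - 1
        for level in self.pre:
            end = min(len(level), p - p % self.b + self.b)
            for q in range(p, end):
                level[q] += delta
            p //= self.b
```

Compared with a tree that stores only subtree sums, the costs are swapped: queries read one entry per level, while updates rewrite a run of sibling prefix sums per level.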

To get around this, we use one of two methods, depending on the value
of k. If k ≥ (lg n)^(1/2) then we divide the input values into groups of size lg n.
Within a group, we do not store the A[i]'s explicitly, but store only their partial
sums. The sums of elements in each of the n/lg n groups are stored in the
simple structure above, but the space required by that data structure is now
O((k + lg lg n) n / lg n) (as the size of each sum could be k + lg lg n bits) which is
o(kn) bits. The space required by each group is lg n (k + lg lg n) bits; this sums
up to kn + n lg lg n = kn + o(kn) bits overall. Clearly the asymptotic complexity
of update and sum are not affected by this change.

If k < (lg n)^(1/2) then we divide the given sequence of elements into groups
of ε lg n / k each. Again, group sums are stored in the simple structure, which
requires O(kn (k + lg lg n) / lg n) = o(kn) bits. Noting that an entire group requires
ε lg n bits, we answer sum queries within a group by table lookup.

Finally, note that given any parameter b > (lg n)^2, we can reduce the
branching factor from b to b/lg n without affecting the complexity of sum;
however, update would now take O(b) steps. Combining this with Theorem 2 we
have:

Theorem 3. For any parameter lg n / lg lg n ≤ b ≤ n, there is a data structure
for the partial sums problem that supports sum in O(log_b n) time and update in
O(b) time, and uses kn + o(kn) bits of space.



Remark 1. Note that Lemma 3 combined with Theorem 2 also gives a trade-off
whereby update takes O(log_b n) and sum takes O(b) time, for any b ≥
lg n / lg lg n.

3 Dynamic Bit Vector 

The dynamic bit vector problem is a special case of the searchable
partial sum problem. The following corollary follows from Theorem 2.

Corollary 1. Given a bit vector of length n, we can support the rank, select
and flip operations in O(lg n / lg lg n) time using o(n) bits of space in addition
to the bit vector.

Similarly, Theorem 3 immediately implies the following result (the only thing to
observe is that of the two cases in Theorem 3 we apply the one that stores the
input sequence explicitly):

Corollary 2. For any parameter lg n / lg lg n ≤ b ≤ n, there is a data structure
for the dynamic bit vector problem that supports rank in O(log_b n) time and flip
in O(b) time, using o(n) bits of space in addition to the bit vector.



3.1 Trade-Off between query and update Times 

In this section we show that the trade-offs between sum and update for
partial sums established in Section 2.2 also hold between select and update for
the special case of the dynamic bit vector problem. We first note the following
proposition.

Lemma 4. The operations select and flip can be supported in O(1) time on a
bit-vector of size N = (lg n)^O(1) on a RAM with word size O(lg n), using a fixed
pre-computed table of size O(n^ε) bits for some constant ε < 1. The space required
is o(N) bits in addition to the pre-computed table and the bit-vector itself.

Proof. Simply store the values in a balanced tree with branching factor √(lg n),
and stop the tree at the level when the number of leaves in the subtree rooted
at the nodes is about (lg n)/2. With each internal node, we keep the searchable
structure of Lemma 1. At the leaf level, we will use a precomputed table to
support flip and select in constant time.

Since the height of the tree is a constant, select and flip can be supported in
constant time. The space used is o(N) besides the O(n^ε) bits required for the
precomputed table. □

We now show how to support select in O(log_b n) time if flip takes O(b) time,
for any parameter (lg n)^2 ≤ b ≤ n. We divide the bit vector into superblocks
of size (lg n)^4. With each superblock we store the number of ones in it. The
sequence of superblock counts is stored in the data structure of Theorem 3 with
the same value of b. This enables us, in O(log_b n) time, to look up the number of
ones to the left of any given superblock. We store each of the superblocks using
the structure of Lemma 4. The space required is o(n) bits.

In addition, we divide the ones in the bit vector into groups of O((lg n)^2)
successive ones each. A group's size varies between 0.5 (lg n)^2 and 2 (lg n)^2. We
construct a weight-balanced B-tree (WBB tree) in the sense of Dietz [?], each
leaf of which corresponds to a group leader. Roughly speaking, the branching
factor of this tree is b. Some small modifications need to be made to Dietz's balance
conditions: for example, the weight of an internal node needs to be redefined to
be the sum of the sizes of the groups under it (we omit details of the WBB tree
in this abstract). Given an integer j, using Dietz's ideas we can locate the group
in which the j-th one lies in O(log_b n) time, and support changes due to flips in
O(b) amortised time.

With each group we store the index of the superblock in which the group's
leader lies. Equally, with each superblock, we store all group leaders which lie in
that superblock.

The span of a group is the index of the superblock in which the next group's
leader lies minus the index of the superblock in which its group leader lies. If
the span of a group is ≥ 2 we say it is sparse. With each sparse group, we store
an array which gives the location of each 1 in the group. Since the maximum
size of this array is O((lg n)^2), using either the implementation of Goodrich
and Kloss or the implementation of Theorem 5 we get a dynamic array which
allows insertions and deletions in O(lg n) time and accesses in O(1) time. This
requires O((lg n)^3) bits per sparse group, but there can only be O(n/(lg n)^4)
sparse groups, so the total space used here is O(n / lg n) bits.

To execute select(j) we first locate the group in which the j-th one lies in
O(log_b n) time, spending O(1) time at each level of the tree. If this group is sparse
then we look up the array associated with the group in O(1) time. Otherwise,
we look in the superblock in which the group leader lies, as well as the adjacent
superblock, where we can answer the query in O(1) time each by Lemma 4.

To set bit j to 1, we first locate the group to which position j would belong.
This is easy given the space bounds: recall that via rank and select one can
move from the (i + 1)st one bit to the ith one bit in O(1) time. As we move along
this "virtual linked list" of 1 bits, we check to see if we have reached a group
leader (by looking to see if it is listed among the group leaders in the current
superblock). Having thus located the group leader in poly-log (negligible) time,
we then insert into the appropriate array (if the group is sparse) and also
into the data structure associated with the superblock. Group splits and merges
are handled straightforwardly. Again for b > lg^4 n, we can actually make the
branching factor b / lg n. For other values of b, using Corollaries 1 and 2
we have:

Theorem 4. Given a bit vector of length n, we can support the rank and select
operations in O(log_b n) time and flip in O(b) amortised time for any parameter
b ≥ lg n / lg lg n, using o(n) bits of extra space.

Remark 2. Note that one can also get a trade-off similar to that of Remark 1
whereby flip takes O(log_b n) and rank/select takes O(b) time, for any b ≥
lg n / lg lg n, using o(n) bits of extra space.

4 Dynamic Arrays 

We look at the problem of maintaining an array structure under the operations
of insertion, deletion and indexing. Goodrich and Kloss [?] have given a structure
that supports (arbitrary) insert and delete operations in O(n^ε) amortized time
and the index operation in O(1) worst-case time using o(n) bits of extra space to
store a sequence of n elements. Here, we first describe a structure that
essentially achieves the same bounds (except that we can now support updates
in O(n^ε) worst-case time) using a well-known implicit data structure called the
recursively rotated list [?]. Using this as a basic block, we will give a structure
that supports all the dynamic array operations in O(lg n / lg lg n) amortized time
using o(n) bits of extra space.

We assume a memory model in which the system returns a pointer to the
beginning of a block of requested size and hence any element in a block of memory
can be accessed in constant time given its index within the block and the block
pointer. This is the same model used in the resizable array of Brodnik et al. [?].

Rotated lists were discovered to support dictionary operations implicitly, on
a totally ordered set. A (1-level) rotated list is an arbitrary cyclic shift of the
sorted order of the given list. We can search for an element in a rotated list on n
elements in O(lg n) time by a modified binary search, though updates (replacing
one value with another) can take O(n) time. However, replacing the largest
(smallest) element with an element smaller (larger) than the smallest (largest)
can be done in O(1) time if we know the position of the smallest element in the
list. A 2-level rotated list consists of elements stored in an array divided into
blocks where the ith block is a rotated list of i elements. It is easy to see that
such a structure containing n elements has r = O(√n) blocks. In addition, all
the elements of block i are less than every element of block i + 1, for 1 ≤ i < r.

This structure supports searches in O(lg n) time and updates in O(√n) time,
if we also explicitly store the position of the smallest element in each block
(otherwise the updates take O(√n lg n) time). This is easily generalized to an
l-level rotated list where searches take O(2^l lg n) time and updates take O(2^l n^(1/l))
time. See [?] for details.
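The modified binary search on a 1-level rotated list can be sketched as below. This is a standard formulation offered as an illustration, not code from the paper; the function name rotated_search is hypothetical, the keys are assumed distinct, and the position of the smallest element is assumed unknown (if it is stored, an ordinary binary search with an index offset suffices).

```python
def rotated_search(lst, x):
    """Return the index of x in a cyclic shift of a sorted list of distinct keys, or -1."""
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if lst[mid] == x:
            return mid
        if lst[lo] <= lst[mid]:
            # lst[lo..mid] contains no wrap-around, so it is sorted.
            if lst[lo] <= x < lst[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # Otherwise the wrap-around is on the left and lst[mid..hi] is sorted.
            if lst[mid] < x <= lst[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1
```

At each step at least one half of the current range is free of the wrap-around point and hence sorted, so a constant number of comparisons decides which half to keep and the search runs in O(lg n) time.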

To use this structure to implement a dynamic array, we do the following. We
simply store the elements of the array in a rotated list based on their order of
insertion. We also keep the position of the first element in each recursive block.
Since we know the size of each block, the index(i) operation takes just O(l) time
in an l-level rotated list implementation of a dynamic array. Inserting or
deleting at position i can be done in a similar fashion as in a rotated list,
taking O(2^l n^(1/l)) time. Thus we have:

Theorem 5. A dynamic array having n elements can be implemented using an
l-level rotated list such that queries can be supported in O(l) time and updates
in O(2^l n^(1/l)) time using an extra space of [...] pointers.

Choosing l to be a small constant, we get:

Corollary 3. A dynamic array containing n elements can be implemented to
support queries in constant time, and updates in O(n^ε) time using [...] pointers,
where ε is any fixed positive constant.
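For intuition, the two-level case can be sketched as follows. This is a simplified illustration in the spirit of the Goodrich and Kloss structure rather than the exact rotated list described above: all blocks have the same fixed capacity cap (an assumption; in a 2-level rotated list block i holds i elements), the class names are hypothetical, and positions are 0-based. With cap close to √n, index takes O(1) time and updates take O(√n) time.

```python
class RotatedBlock:
    """Fixed-capacity circular buffer: one 'rotated' block of the array."""

    def __init__(self, cap):
        self.cap, self.buf, self.head, self.size = cap, [None] * cap, 0, 0

    def get(self, o):
        return self.buf[(self.head + o) % self.cap]

    def put(self, o, x):
        self.buf[(self.head + o) % self.cap] = x

    def insert(self, o, x):
        # Requires size < cap; shifts offsets o..size-1 right: O(cap) time.
        for q in range(self.size, o, -1):
            self.put(q, self.get(q - 1))
        self.put(o, x)
        self.size += 1

    def delete(self, o):
        # Shifts offsets o+1..size-1 left: O(cap) time.
        x = self.get(o)
        for q in range(o, self.size - 1):
            self.put(q, self.get(q + 1))
        self.size -= 1
        return x

    def push_front_pop_back(self, x):
        # Full block only: rotate by one slot in O(1) time.
        self.head = (self.head - 1) % self.cap
        out, self.buf[self.head] = self.buf[self.head], x
        return out

    def push_back_pop_front(self, x):
        # Full block only: the mirror image of the rotation above.
        out, self.buf[self.head] = self.buf[self.head], x
        self.head = (self.head + 1) % self.cap
        return out


class TwoLevelArray:
    """Invariant: every block except the last is full; the last is non-empty."""

    def __init__(self, cap):
        self.cap, self.blocks, self.n = cap, [], 0

    def index(self, i):
        # Block sizes are known, so locating position i takes O(1) time.
        return self.blocks[i // self.cap].get(i % self.cap)

    def insert(self, i, x):
        if self.n == len(self.blocks) * self.cap:  # no free slot anywhere
            self.blocks.append(RotatedBlock(self.cap))
        k, o = divmod(i, self.cap)
        blk = self.blocks[k]
        if blk.size == self.cap:
            carry = blk.delete(self.cap - 1)  # make room in block k
            blk.insert(o, x)
            for mid in self.blocks[k + 1:-1]:  # O(1) per full block
                carry = mid.push_front_pop_back(carry)
            self.blocks[-1].insert(0, carry)
        else:
            blk.insert(o, x)
        self.n += 1

    def delete(self, i):
        k, o = divmod(i, self.cap)
        x = self.blocks[k].delete(o)
        if k < len(self.blocks) - 1:  # refill block k from the right
            carry = self.blocks[-1].delete(0)
            for mid in reversed(self.blocks[k + 1:-1]):
                carry = mid.push_back_pop_front(carry)
            self.blocks[k].insert(self.cap - 1, carry)
        if self.blocks[-1].size == 0:
            self.blocks.pop()
        self.n -= 1
        return x
```

An update shifts elements inside at most two blocks and rotates each intermediate block by one slot, so its cost is O(cap + n/cap). Keeping that near √n as n grows would need occasional rebuilding with a new capacity, which this sketch omits.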

Using this structure, we now describe a structure that supports all the
dynamic array operations in O(lg n / lg lg n) amortized time using o(n) bits of extra
space.

We divide the given list of length n into sub-lists of length Θ(lg^4 n). In
particular, each sub-list will be of length between (1/2) lg^4 n and 2 lg^4 n. (We implement
these leaves using the dynamic array structure of Theorem 5.) We construct a
weight-balanced B-tree (WBB tree) in the sense of Dietz [?], each leaf of which
corresponds to a sub-list. Some small modifications need to be made to Dietz's
balance conditions: for example, the weight of an internal node needs to be
redefined to be the sum of the sizes of the sub-lists under it. The space required
to store this tree is o(n) bits. Supporting insert and delete in O(lg n / lg lg n)
amortized time is done as in [?]. To find the jth element of the list, we first find
the block in which the jth element occurs, using the select(j) operation on the
WBB tree and then find the required element in that block. Thus we have:

Theorem 6. A dynamic array can be implemented using o(n) bits of extra space
besides the space used to store the n records, in which all the operations can be
supported in O(lg n / lg lg n) amortized time, where n is the current size of the
array.






R. Raman, V. Raman, and S.S. Rao 



5 Conclusions 

We have given a succinct searchable partial sum data structure where sum, select
and update can be supported in optimal O(lg n / lg lg n) time. We have also given
structures in which sum can be supported in O(log_b n) time and update in O(b)
time for any b ≥ lg n / lg lg n. These trade-offs also hold between select/rank and
update (flip) for the dynamic bit vector problem. These structures use at most
o(n) words more than necessary.

For the dynamic array, we have given two structures, both using o(n) bits of
extra space where n is the number of elements in the array: one supports lookup
in constant worst-case time and updates in O(n^ε) worst-case time, and the other
supports all operations in O(lg n / lg lg n) amortized time.

The following problems remain open: 

1. In the searchable partial sums problem, we were able to support select in
O(log_b n) time and update in O(b) time using o(kn) bits for the special case
of k = 1. When is this trade-off achievable in general?

2. For the dynamic array problem, are there trade-offs (both upper and lower
bounds) similar to those in the partial sum problem between query and
update operations? In particular, is there a structure where updates can be
made in O(1) time and access in O(n^ε) time?

3. Another related problem looked at by Dietz, and Fredman and Saks is the
List indexing problem which is like the dynamic array problem, but adds the
operation position(x), which gives the position of item x in the sequence, and
also modifies insert to insert a new element after an existing one. Dietz has
given a structure for this problem that takes O(n) extra words and supports
all the operations in the optimal O(lg n / lg lg n) time. It is not clear that one
can reduce the space requirement to just o(n) extra words, and still support
the operations in optimal time.

Abstract. A polygon P admits a walk from a boundary point s to
another boundary point t if two guards can simultaneously walk along
the two boundary chains of P from s to t such that they are always
visible to each other. A walk is called a straight walk if no backtracking
is required during the walk. A straight walk is discrete if only one guard
is allowed to move at a time, while the other guard waits at a vertex.
We present simple, optimal O(n) time algorithms to determine all pairs
of points of P which admit walks, straight walks and discrete straight
walks. The chief merits of the algorithms are that they require simple
data structures and do not assume a triangulation of P. Furthermore,
the previous algorithms for the straight walk and the discrete straight
walk versions ran in O(n log n) time even after assuming a triangulation.



1 Introduction 

As a topological entity a simple polygon is merely a closed non-intersecting curve;
as a metric entity its boundary structure can be extremely complex. Various
computational problems related to simple polygons are the result of efforts to
classify polygons with respect to their boundary structures. Among these are
visibility problems in their different incarnations, art-gallery or guard problems,
which are fundamental to the understanding of polygon visibility. Many of
these problems can be solved efficiently, if one assumes that a triangulation of
the polygon is available. However, any algorithm that requires the notoriously
complex linear-time triangulation algorithm of Chazelle [?] is simply impractical.
In this paper, we show that it is possible to find efficient algorithms that avoid
this requirement. In particular, we show how to solve various versions of the
2-guard walkability problem of Icking and Klein [?] in optimal linear time.

The 2-guard problem involves determining whether two guards can move
along the boundary of a simple polygonal room from a common starting point s
to a common destination point t, in opposite directions, while remaining in sight
of each other at all times. If they can, then P is said to be 2-guard walkable
(or, simply, walkable) with respect to (s,t). A polygon is said to be walkable if
such a walk exists for some pair (s,t). Icking and Klein [?] introduced this class
of problems, and showed that it is possible to test in O(n log n) time whether
P is walkable with respect to a pair of points (s,t). Subsequently, a linear time
solution was presented by Heffernan [7], assuming that a triangulation of the
polygon is available. If a triangulation is not assumed the bound is O(n log n).

A stronger notion than walkability is that of straight walkability. A walkable
polygon is said to be straight walkable if the guards are able to complete
the walk, again under the condition of constant mutual visibility, without
having to backtrack during their walk to ensure this. Tseng et al. [?] presented an
O(n log n) algorithm to determine whether P is straight walkable, and within
the same time bound showed how to generate all such pairs. An even stronger
notion is that of discrete straight walkability [?]. A straight walkable polygon
is said to be discretely straight walkable if only one of the guards is allowed to
walk at a time, while the other remains stationary at a vertex. Testing discrete
straight walkability and computing all discrete straight walkable pairs of points
can be performed in O(n log n) time [?].

The main results of this paper are surprisingly simple, optimal linear-time 
algorithms that determine all pairs of points of P with respect to which the 
polygon P is: (i) walkable, (ii) straight walkable, and (iii) discretely straight 
walkable. While we improve the best known results for (discrete) straight walk- 
ability, the surprising element is that all our algorithms use only simple sweeps 
and elementary data structures. 



2 Preliminaries 

Let P be a simple polygon with n vertices. The open (closed) clockwise chain
of P from u to v is denoted by P(u,v) (P[u,v]). Half-open chains will be
denoted by P[u,v) or P(u,v], depending on which end is open. For points p and
q that belong to a boundary chain of P with distinct end-points, if p precedes
q in counterclockwise (clockwise) order, we denote this by p <_ccw q (p <_cw q).
Every reflex vertex u (of P) determines two components, a clockwise component
P_cw(u), consisting of the chain P[u, u_cw] and the bounding chord u u_cw, and
a counterclockwise component P_ccw(u),
consisting of the chain P[u_ccw, u] and the bounding chord u u_ccw (see Fig. 1).
The bounding chords u u_cw and u u_ccw are called respectively the clockwise and
the counterclockwise chords of u. A component is called non-redundant if it does
not contain any other component.

A walk is said to be in an s-deadlock (t-deadlock) if two guards, both starting
from s (t), have reached points from where neither can move forward without
losing sight of each other. Points u and v illustrate this situation in Fig. 1 for two
guards starting from s. Various cases of s-deadlock and t-deadlock are discussed
in [?]. A walk with respect to a pair of points, (s,t), is deadlock free iff it is
both s-deadlock free and t-deadlock free. Equivalently, a walk with respect to
the pair (s,t) is s-deadlock (t-deadlock) free iff, for a walk originating at s (t),
the moment a guard enters a component, say Q, the other guard must be already
inside Q. We will use this latter formulation in designing our algorithms.




Fig. 1. Components P_cw(u) and P_ccw(v) creating s-deadlock



A polygon P is LR-visible jSj if there exists a pair of points s and t on P such
that P(s,t) and P(t,s) are weakly visible from each other. It was shown 0 that
P is 2-walkable iff it is LR-visible and has a deadlock-free region. Bhattacharya
and Ghosh [2] have given a simple linear-time algorithm to determine if P is
LR-visible. The algorithm also allows one to compute the shortest path tree of P
from an arbitrary vertex. Thus if P is LR-visible, all its non-redundant
components can be determined in linear time gI3. Using these components, it is
easy to compute in O(n) time all possible pairs of boundary chains (A_i, B_i),
i = 0, ..., m, such that for any s ∈ A_i and any t ∈ B_i, P is LR-visible with
respect to the pair of boundary points (s,t). The above pairs of chains can be
determined with either all the A_i's disjoint or all the B_i's disjoint. The
disjoint ones are reported in counterclockwise order. This latter output is the
input to our algorithms in this paper.

Additional conditions are necessary for a 2-walkable polygon to be (discrete)
straight walkable. A polygon is straight walkable (discrete straight walkable) if
there exists a pair of points s and t on P such that P is 2-walkable with respect
to s and t, and the chains P(s,t) and P(t,s) do not have a configuration called
a “wedge” (“semi-wedge”).

3 Two-Walkability 

3.1 Overview 

From the pairs of chains (A_i, B_i), i = 0, ..., m, that describe the LR-visible
pairs of points of P, we choose a set of point pairs (s_i, t_i) ∈ A_i × B_i,
i = 0, ..., m. The points satisfy the property that
s_0 <_ccw s_1 <_ccw ... <_ccw s_m <_ccw t_0 <_ccw t_1 <_ccw ... <_ccw t_m <_ccw s_0.
We assume that the s_i's and the t_i's are not reflex vertices. We shall show in
a later section how to choose such a set of point-pairs.

The main idea is to find a maximal deadlock-free region around each s_i and t_i
by using a clockwise sweep by a clockwise guard and a counterclockwise sweep by
a counterclockwise guard. This is achieved by a pair of “cooperating processes”
termed a walk, in which the two guards traverse the boundary of the polygon
without losing sight of each other. In the i-th iteration, walk_i can be described
as follows: the counterclockwise guard, starting from s_i, is targeted to reach
s_{i+1}, while the clockwise guard, also starting at s_i, is targeted to go as far
as possible without crossing t_i. In particular, the clockwise guard walks
towards t_i until it encounters a reflex vertex v_l whose clockwise component
P_cw(v_l) does not contain the current position of the counterclockwise guard.
Since any advancement of the clockwise guard would result in the loss of
co-visibility, control passes to the counterclockwise guard, who begins to walk
from its current position in an attempt to enter this component. This walk of the
counterclockwise guard can terminate in one of three different ways.

— It reaches its preassigned destination (s_{i+1}), in which case walk_i is
terminated after stacking some related parameters;

— It succeeds in entering P_cw(v_l), in which case it waits, while the clockwise
guard resumes advancing towards t_i;

— It encounters a reflex vertex v_r and P_ccw(v_r) does not contain v_l, in which
case we report a deadlock. We now let the counterclockwise guard reach its
target s_{i+1}, and terminate walk_i.

After we have generated all the walks, walk_i, i = 0, ..., m, the walk parameters
left on the stack correspond to (s_i, t_i) pairs that permit s-deadlock free walks.
We shall prove this claim in the next section.



3.2 Details of the Algorithm 

To implement walk_i described above, we need to be able to answer the following
queries efficiently. Let l and r be the current positions of the clockwise and
the counterclockwise guards respectively. If l (r) is a reflex vertex, let l_cw
(r_ccw) be the other endpoint of the bounding chord of P_cw(l) (P_ccw(r)).

Visibility: Are l, r ∈ P co-visible?
Clockwise containment: Is r in P_cw(l)?
Counterclockwise containment: Is l in P_ccw(r)?
Crossing-over event: Is r <_ccw l_cw?

Since (s_i, t_i) is an LR-visible pair, every component of P must contain one of
s_i or t_i. Below, we will often appeal to this fact. In our description below, we
differentiate between the cases i = 0 and i > 0.
Fig. 2. l_cw can lie in P(r_0, s_0] or on P[t_0, r_0).



Implementing the Queries for walk_0.

Visibility Query: As seen below, when the guards are at locations l_0 and r_0
during the walk process, we need to answer the visibility query. It may be noted
that this is not a general visibility query since the two guards are monotonically
traversing from point s_0 towards their destinations (s_1 and t_0). In order to
answer this query efficiently (in constant amortized time), we preprocess P with
respect to the pair (s_0, t_0). The preprocessing involves computing the
“restricted shortest paths” from s_0 and t_0 to every vertex on P. A restricted
shortest path from s_0 to vertex u ∈ P(s_0, t_0) (or, v ∈ P(t_0, s_0)) is the
shortest path between s_0 and u (v) ignoring the effect of P(v, s_0) (P(u, s_0)).
This is a standard procedure that has been used in several papers (refer to
|4l2j 1) for exactly the same reason (i.e., to answer visibility queries).

Clockwise Containment Query: Let l_0 l_cw denote the bounding chord of the
component P_cw(l_0). We determine, in constant time, if l_cw ∈ P(t_0, s_0). Note
that this is easily determined by inspecting the last edges on the shortest paths
from s_0 and t_0 to l_0, as these paths do not cross. If not, then P_cw(l_0)
contains r_0. If yes, as P(r_0, l_0) does not intersect l_0 r_0, we test if
P(t_0, r_0) intersects this segment. If it does not, we easily determine if
l_cw ∈ P(r_0, s_0). If it does, we resolve between the two cases shown in Fig. 2
as follows. We traverse the boundary chain from r_0 to t_0 to find the first
vertex x from which l_0 is visible. Clearly, x is reflex. Since P is LR-visible
with respect to the pair (s_0, t_0), this vertex also has the property that all
counterclockwise components of the reflex vertices of P[x, r_0) contain l_0.

To find x, the counterclockwise guard walks towards t_0, stopping at s_1
(= CCW.limit) or x, whichever comes earlier. This stop becomes the new r_0. In
either case, the property of nonintersection of the segment l_0 r_0 and the chain
P(r_0, l_0) is preserved.

Counterclockwise Containment Query: Let r_0 r_ccw denote the counterclockwise
chord of the component P_ccw(r_0). We determine, in constant time, if
r_ccw ∈ P(s_0, t_0). If not, it follows from the LR-visibility of P that the
component P_ccw(r_0) contains s_0 and, a fortiori, l_0. If yes, we check, in
constant time again, if the ray r_0 r_ccw is directed into the pocket determined
by the chain P[r_0, l_0] and the chord r_0 l_0. If yes, then the component
P_ccw(r_0) contains l_0, otherwise not.

Crossing-over Event Query: We must determine if the clockwise ray shot from
l_0 hits r_0. In other words, the counterclockwise guard must find l_cw to effect
the crossover. We know that P[r_0, l_0] does not intersect the interior of
l_0 r_0. Also, P[l_0, t_0] does not intersect the interior of l_0 r_0. Hence
r_0 = l_cw iff a ray from r_0 in the direction of l_0 hits the chain P[s_0, t_0].

We summarize the above discussions with the following claim:

Lemma 1. We can check in (amortized) constant time if r_0 ∈ P_cw(l_0),
l_0 ∈ P_ccw(r_0), and if the crossing-over event has occurred, i.e., when the
counterclockwise guard reaches l_cw.

Implementing the Queries for walk_i. We will now discuss how to answer
the same queries for an arbitrary LR-pair (s_i, t_i), where s_i ∈ P(t_0, s_0) and
t_i ∈ P(s_0, t_0). Let l_i and r_i denote the positions of the clockwise and
counterclockwise guards respectively.

Clockwise Containment Query: The following cases arise.

Case 1: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_0, s_i]. This problem is the same as the
clockwise containment problem for (s_0, t_0).

Case 2: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_i, t_0]. In this case P_cw(l_i) contains
r_i since l_cw cannot lie on P[l_i, t_0].

Case 3: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_i, t_0]. Again, this is very similar to
Case 1.

Case 4: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_0, s_i]. Clearly, if l_cw ∈ P[s_0, t_0],
P_cw(l_i) does not contain r_i. Suppose this is not the case. As P is LR-visible
with respect to (s_0, t_0), l_cw ∉ P[l_i, s_0]. We know that P(r_i, l_i) does not
intersect l_i r_i. Again, since P is LR-visible with respect to (s_0, t_0),
P[t_0, r_i) cannot intersect l_i r_i. Hence, we can easily determine whether
l_cw ∈ P[r_i, l_i] or l_cw ∈ P[t_0, r_i].

Counterclockwise Containment Query: Again, this query is resolved by examining
the following cases, given that P_cw(l_i) does not contain r_i.

Case 1: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_0, s_i]. This is similar to the case for
(s_0, t_0).

Case 2: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_i, t_0]. In this case r_ccw must lie on
P[t_0, s_0], since P is LR-visible with respect to (s_0, t_0). Therefore,
P_ccw(r_i) contains l_i.

Case 3: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_i, t_0]. Again, this is very similar to
Case 1.

Case 4: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_0, s_i]. Clearly, if r_ccw ∈ P[s_0, t_0]
then P_ccw(r_i) does not contain l_i. Suppose r_ccw ∈ P[t_0, s_0]. Suppose the ray
from r_i towards r_ccw is directed towards the interior of the pocket determined
by P[r_i, l_i] and l_i r_i. In this case, r_ccw ∉ P[l_i, s_0] as shown in Fig. 3;
otherwise l_i is not visible from P[s_0, t_0], contradicting the LR-visibility of
P with respect to (s_0, t_0). If the ray is directed away from the pocket, then
r_ccw ∈ P[l_i, s_0] and so P_ccw(r_i) does not contain l_i.
Fig. 3. r_ccw cannot lie on P[l_i, s_0]



Crossing-over Event Query: We determine if the ray shot from l_i hits r_i.

Case 1: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_0, s_i]. This problem is the same as the
crossing-over event problem for (s_0, t_0).

Case 2: l_i ∈ P[s_0, t_i] and r_i ∈ P[t_i, t_0]. This case cannot arise as
l_cw ∈ P[t_0, s_0].

Case 3: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_i, t_0]. This problem is the same as the
crossing-over event problem for (s_0, t_0).

Case 4: l_i ∈ P[s_i, s_0] and r_i ∈ P[t_0, s_i]. First we check if
l_cw ∈ P[s_0, t_0]. If so, r_i ≠ l_cw. Suppose l_cw ∈ P[t_0, s_0]. Since P is
LR-visible with respect to (s_0, t_0), the first intersection of P[t_0, l_i] in
counterclockwise order with l_i l_cw is l_cw. Since P(r_i, l_i) does not intersect
l_i r_i in the interior, in our case r_i = l_cw.

We summarize the above discussions as follows:

Lemma 2. We can check in (amortized) constant time if r_i ∈ P_cw(l_i),
l_i ∈ P_ccw(r_i), and if the crossing-over event has occurred, i.e., when
r_i = l_cw.

Deadlock-Test: The process walk_i is described in a slightly more general form
in the procedure Deadlock-Test as shown in Fig. 4. It accepts four arguments:
the start points CW.start and CCW.start of the clockwise and counterclockwise
guards, and their respective destinations, CW.limit and CCW.limit. The algorithm
returns the maximal s-deadlock free chain P[r,l] in P[CCW.limit, CW.limit].

Several observations are in order. We note that if r_i and l_i are the current
positions of the two guards, the boundary chain P[r_i, l_i] does not obstruct
their mutual visibility. This is easily seen from the invariance of the following
two assertions during the movement of the guards from their start positions to
the current positions.

— All the clockwise components generated by the reflex vertices on P[s_i, l_i],
with the possible exception of l_i, contain r_i.

— All the counterclockwise components generated by the reflex vertices on
P[r_i, s_i], with the possible exception of r_i, contain l_i.
Algorithm DEADLOCK-TEST(CW.start, CCW.start, CW.limit, CCW.limit)
Output: A pair of points (l, r). If r ≠ CCW.limit, P[r,l] has an s-deadlock.
begin
    l ← CW.start; r ← CCW.start;
    { l (r) represents the current position of the clockwise (counterclockwise) guard }

    { The clockwise guard moves }
    Loop1: while (l is not a reflex vertex and l <_cw CW.limit) do
               l ← next(l);
           if CW.limit ≤_cw l then
               Output (CW.limit, CCW.limit)
               { P[CCW.limit, CW.limit] has no s-deadlock }
               Return
           else { l is a reflex vertex }
               if P_cw(l) contains r then
                   l ← next(l);
                   go to Loop1;
               else go to Loop2
               { l is a reflex vertex and r is not contained in P_cw(l) }

    { The counterclockwise guard moves }
    Loop2: while (r is not a reflex vertex and CCW.limit <_cw r and l_cw <_cw r) do
               r ← next(r);
           if r ≤_cw l_cw then
               { counterclockwise guard has entered the clockwise component at l; }
               { hand over control to clockwise guard }
               r = l_cw;
               l ← next(l);
               go to Loop1;
           else if (r = CCW.limit) then
               Output (l, CCW.limit)
               Return
           else { r is reflex }
               if P_ccw(r) contains l then
                   { counterclockwise guard continues in Loop2 }
                   r ← next(r);
                   go to Loop2;
               else { deadlock situation }
                   Output (l, r)
                   Return
end



Fig. 4. Algorithm Deadlock-Test 
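
The control flow of Fig. 4 can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the paper's implementation. Each guard's route is given as an explicit list of stops (`cw_path` from CW.start to CW.limit, `ccw_path` from CCW.start to CCW.limit). The geometric queries of Lemmas 1 and 2 are replaced by three hypothetical callbacks (`is_reflex`, `cw_contains`, `ccw_contains`) that the caller must supply. The crossing-over event is modelled as the moment `cw_contains(l, r)` becomes true.

```python
from typing import Callable, Hashable, Sequence, Tuple


def deadlock_test(
    cw_path: Sequence[Hashable],
    ccw_path: Sequence[Hashable],
    is_reflex: Callable[[Hashable], bool],
    cw_contains: Callable[[Hashable, Hashable], bool],   # is r inside P_cw(l)?
    ccw_contains: Callable[[Hashable, Hashable], bool],  # is l inside P_ccw(r)?
) -> Tuple[Hashable, Hashable]:
    """Return (l, r). A result with r != ccw_path[-1] reports an s-deadlock."""
    i = j = 0  # indices of the clockwise guard l and the counterclockwise guard r
    while True:
        # Loop1: the clockwise guard advances until a reflex vertex blocks it.
        while i < len(cw_path) - 1:
            l = cw_path[i]
            if is_reflex(l) and not cw_contains(l, ccw_path[j]):
                break  # r lies outside P_cw(l); the other guard must move
            i += 1
        else:
            # CW.limit was reached: no s-deadlock on the whole chain.
            return cw_path[-1], ccw_path[-1]

        l = cw_path[i]
        # Loop2: the counterclockwise guard tries to enter P_cw(l).
        while True:
            r = ccw_path[j]
            if cw_contains(l, r):
                i += 1  # crossing-over: hand control back to the clockwise guard
                break
            if j == len(ccw_path) - 1:
                return l, r  # r reached CCW.limit
            if is_reflex(r) and not ccw_contains(r, l):
                return l, r  # neither guard can advance: deadlock
            j += 1


# Toy instance: "a" and "x" are reflex; P_cw("a") contains only "y" and "s1".
TOY_CW = ["s", "a", "b", "t"]
TOY_CCW = ["s", "x", "y", "s1"]
TOY_REFLEX = {"a", "x"}
TOY_INSIDE_A = {"y", "s1"}
```

Each index only ever increases, so the number of callback invocations is linear in the number of stops, mirroring the cost bound claimed for the procedure.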



In summary, we conclude with this theorem: 

Theorem 1. Given a pair of points (s_i, t_i) with respect to which P is LR-visible,
Deadlock-Test(s_i, s_i, t_i, s_{i+1}) returns a pair of points (l_i, r_i); if
r_i = s_{i+1}, then the chain P(r_i, l_i) is deadlock-free, else P(r_i, l_i) is a
deadlock region with respect to a walk starting at s_i. The cost is linear in the
number of vertices contained in the chain P(r_i, l_i).

3.3 Reporting All Deadlock-Free Pairs 

We now show how to combine the results of the calls to procedure Deadlock-Test
from consecutive iterations. Suppose that after iterations i and i + 1,
Deadlock-Test indicates that P[r_i, l_i] and P[r_{i+1}, l_{i+1}] are deadlock
free. Since r_i = s_{i+1} and l_{i+1} <_ccw s_{i+1}, the two regions overlap and we
can combine the results to claim that P[r_{i+1}, l_i] is deadlock-free.
Furthermore, in order to prevent either the clockwise guard or the
counterclockwise guard from repeatedly going over any overlapping portions in
different iterations (which will happen if l_{i+1} <_ccw s_i), the walks make use
of information stored on a special snapshot-stack and skip over these regions
during their walks.

Whenever an iteration reports a deadlock region, the corresponding boundary
chain of P is marked as ineligible. Future iterations will skip over this
portion of the polygon, traversed by the clockwise guard.

The algorithm All-Points-Pairs implements the walks for all the iterations 
by making calls to Deadlock-Test to determine all deadlock free regions. 

Algorithm All-Points-Pairs
Output: Boundary chains that admit s-deadlock.
begin
    Set i ← 0; s_{m+1} ← t_m;
    while (i < m) do
        CW.start ← CCW.start ← s_i; CW.limit ← t_i; CCW.limit ← s_{i+1};
        Loop: (l, r) ← Deadlock-Test(CW.start, CCW.start, CW.limit, CCW.limit)
        if r = CCW.limit then
            PUSH(snapshot-stack, l, r, CW.limit);
            i ← index(CCW.limit);
        else
            Label the chain P(r,l) as ineligible.
            if (snapshot-stack ≠ empty) then
                (a, b, c) ← POP(snapshot-stack);
                if P(b,a) does not contain l then
                    CW.start ← l; CCW.start ← r; Go to Loop;
                else
                    CW.start ← a; CCW.start ← r; CW.limit ← c; Go to Loop;
            else i ← index(CCW.limit);

    Traverse the ineligible boundary chains and remove all pairs whose associated
    s points lie on these parts. Report the non-deleted pairs as the pairs that
    allow an s-deadlock free walk.
end

All-Points-Pairs needs to maintain lists of boundary objects not traversed
by the clockwise guard yet, as well as a list of boundary objects labeled as
“ineligible”. Both lists are easily maintained as stacks.

Since the boundary chains are traversed at most once in clockwise order and
at most once in counterclockwise order, and since the subroutine Deadlock-Test
outputs a chain in time proportional to the size of the chain, we claim
therefore:

Lemma 3. All the pairs (s_i, t_i), i = 0, 1, ..., m, that allow s-deadlock free and
t-deadlock free walks can be correctly determined in O(n) time.

Given the lists produced in Lemma 3, it is easy to determine the pairs of points
with respect to which the polygon is 2-walkable. Therefore, we have:

Theorem 2. Given (s_i, t_i), i = 0, 1, ..., m, and P, it is possible to determine
in optimal time the pairs of points among (s_i, t_i), i = 0, 1, ..., m, with
respect to which P is 2-walkable.

3.4 Reporting All Pairs of Bounding Chains with Respect to which
P has no s-Deadlock

IldlSIllI gave methods to determine the pairs of bounding chains (A_i, B_i),
i = 0, 1, ..., m, of P such that for any arbitrary (s,t) ∈ (A_i, B_i), the polygon
P is LR-visible. The pairs can be determined with either the A_i's all disjoint or
the B_i's all disjoint. We assume all A_i's to be disjoint when they are tested for
s-deadlock and assume all B_i's to be disjoint when they are tested for
t-deadlock.

We describe our method for s-deadlock only. Our objective is to identify the
parts A_i' and B_i' of A_i and B_i respectively such that for any arbitrary
(s,t) ∈ (A_i', B_i'), the polygon P is s-deadlock free. Let us consider the pair
(A_i, B_i). In order to test (A_i, B_i) for s-deadlock, we replace this pair
(A_i, B_i) by a set of points (s_j, t_j), j = 0, 1, ..., where s_0, s_1, ...
belong to A_i and t_0, t_1, ... belong to B_i, such that the s-deadlock free
chains corresponding to (A_i, B_i) can be determined once we know the status of
the (s_j, t_j)'s for s-deadlock.

Let v_1, v_2, ..., v_k be the reflex vertices on A_i = P_cw(a_i, a_i'). Let
v_0 = a_i and v_{k+1} = a_i'. Let A_{i_j} = P(v_{j-1}, v_j), j = 1, 2, .... Let
(s_j, t_j) be any arbitrary pair in (A_{i_j}, B_i). We can show that

Lemma 4. P is s-deadlock free for chain (A_{i_j}, B_i) if and only if P is
s-deadlock free for the pair (s_j, t_j).

From the above lemma we see that we can replace each chain pair by point
pairs whose number is one more than the number of reflex vertices in A_i.
Since the A_i's are disjoint, the total number of points generated is at most n.
Also it should be noted that the s_j's are all non-reflex points. Thus we have the
two theorems below.

Theorem 3. Given LR-visible chain pairs (A_i, B_i), i = 1, ..., m, where all A_i's
are disjoint, all s-deadlock-free chain pairs can be determined in O(n) time.

Theorem 4. All 2-walkability pairs of an arbitrary polygon P can be determined
in optimal linear time.
4 Straight Walkability

Our input is a set of chain-pairs such that each chain-pair admits
2-walkability. We outline an algorithm to determine if P is straight walkable
and, if so, determine the chain-pairs which admit such walkability.

The concept of a wedge is important for determining whether P is straight
walkable |H|. A wedge is a chain P[v,u], defined by two reflex vertices u and
v such that P_ccw(u) (P_cw(v)) contains v (u) and the clockwise chord u u_cw and
the counterclockwise chord v v_ccw intersect. A wedge is called non-redundant if
it does not contain any other wedge or a component. Otherwise, it is called
redundant.

Let W be any wedge in P. Tseng et al. m3 showed that

Lemma 5. If P is straight walkable with respect to (s,t) then either s ∈ W or
t ∈ W.

Therefore P is not straight walkable if it has three or more disjoint
wedges. To determine if P is straight walkable we proceed along the following
lines. Let P be 2-walkable with respect to the point pair (s,t). We compute all the
non-redundant wedges in stages. First, all the non-redundant wedges containing
s or t are computed. Then we compute all the non-redundant wedges contained
in P(s,t) and P(t,s), keeping track of the number of disjoint wedges computed
so far. We stop if three disjoint wedges are detected. We can implement the
above ideas using some of the techniques developed in Section 3. Hence

Theorem 5. We can determine all pairs of chains with respect to which P is 
straight walkable in linear time. 
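
The stopping rule "three disjoint wedges" follows from Lemma 5: every wedge must contain s or t, and two points cannot hit three pairwise disjoint chains. The check itself can be illustrated with a short Python sketch. It is not the staged linear-time computation outlined above. It is a simpler quadratic test under the assumption that each wedge has already been reduced to a boundary arc (start, end) of vertex indices on a cycle of n positions, read counterclockwise and inclusive of both ends. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Arc = Tuple[int, int]  # (start, end) indices on a cycle of n positions, inclusive


def has_three_disjoint_arcs(arcs: Sequence[Arc], n: int) -> bool:
    """True if some three of the given circular arcs are pairwise disjoint."""
    for s, e in arcs:
        la = (e - s) % n  # the anchor arc occupies offsets 0..la from s
        # Keep the arcs that avoid the anchor; measured from s they become
        # ordinary intervals [rs, rs + lb] inside the open range (la, n).
        rel = []
        for bs, be in arcs:
            rs = (bs - s) % n
            lb = (be - bs) % n
            if rs > la and rs + lb < n:
                rel.append((rs + lb, rs))
        # Classic earliest-end greedy for disjoint intervals on a line.
        rel.sort()
        count, last_end = 1, la
        for end, start in rel:
            if start > last_end:
                count += 1
                last_end = end
        if count >= 3:
            return True
    return False
```

Fixing one arc and cutting the cycle just after it turns the remaining problem into interval scheduling on a line, where the greedy choice is optimal. Trying every arc as the anchor costs O(m^2 log m) for m arcs, which is acceptable for a sketch but not for the linear bound of Theorem 5.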



5 Discrete Straight Walkability 

The problem of determining the chain-pairs with respect to which P is discretely
straight walkable can also be solved in a similar manner.

The concept of a semi-wedge P| is important here. A semi-wedge is a chain
P[v,u], defined by two reflex vertices u and v such that P_ccw(u) (P_cw(v))
contains v (u) and the clockwise chord u u_cw (counterclockwise chord v v_ccw),
while u_ccw and v_cw belong to the relative interior of some edge. A semi-wedge
is called non-redundant if it does not contain any other semi-wedge or wedge.
Otherwise, it is called redundant.

Analogous to the result by fj, if W is a semi-wedge of P, it was shown in [2]
that

Lemma 6. If P is discretely straight walkable with respect to (s,t) then either
s ∈ W or t ∈ W.

This implies that P is not discretely straight walkable with respect to any
(s,t) pair if it contains three disjoint non-redundant semi-wedges.






Our algorithm proceeds in a similar manner: determine three disjoint semi- 
wedges with respect to an arbitrary (s, t) pair. Details are provided in Appendix 
C. The claims are summarized in the following theorem: 

Theorem 6. We can determine all pairs of chains with respect to which P is 
discretely straight walkable in linear time. 

6 Conclusions 

In this paper, we present surprisingly simple and efficient (optimal, linear-time) 
algorithms for many 2-guard problems. We believe that this work will help to 
bridge the gap between impractical linear-time algorithms for visibility problems 
and practical efficient implementations. 
Abstract. This paper investigates the problem of time-optimum movement
planning in d = 2, 3 dimensions for a point robot which has
bounded control velocity through a set of n polygonal regions of given 
translational flow velocities. This intriguing geometric problem has im- 
mediate applications to macro-scale motion planning for ships, sub- 
marines and airplanes in the presence of significant flows of water or 
air. Also, it is a central motion planning problem for many of the meso- 
scale and micro-scale robots that recently have been constructed, that 
have environments with significant flows that affect their movement. In 
spite of these applications, there is very little literature on this problem, 
and prior work provided neither an upper bound on its computational 
complexity nor even a decision algorithm. It can easily be seen that an optimum
path for the d = 2 dimensional version of this problem can consist
of at least an exponential number of distinct segments through flow regions.
We provide the first known computational complexity hardness
result for the d = 3 dimensional version of this problem; we show the 
problem is PSPACE hard. We give the first known decision algorithm 
for the d = 2 dimensional problem, but this decision algorithm has very 
high complexity. We also give the first known efficient approximation 
algorithms with bounded error. 



1 Introduction 

1.1 Formulation of the Problem and Motivation 

We assume the problem is given as a polyhedral decomposition of d-space, where
each region r defined by the polyhedral decomposition has an assigned
translational flow defined by a vector f_r. There is also associated with each
region r a non-negative real number b_r giving the maximum Euclidean norm of the
control velocity that the robot can apply within region r. The robot is considered
to be a point with a given initial position and also a given final position to be
reached by the robot.

At time t = 0, the point robot is at some given initial position point. Within
each region r, the robot can apply, at each time t > 0 and in any direction, a
translational control velocity vector v(t) of bounded Euclidean norm
|v(t)| ≤ b_r. However, the actual velocity of the robot at time t is given by the
sum v(t) + f_r of its control velocity vector v(t) and the translational flow
velocity of region r, as shown by Figure 1a.

* Supported by NSF ITR EIA-0086015, NSF-IRI-9619647, NSF CCR-9725021, SEGR 
Award NSF-llS-01-94604, Office of Naval Research Contract N00014-99-1-0406. 
Fig. 1. (b) Optimum Path is Straight Line Segment



The flow path optimization problem is to find an optimum path of movement
of the point robot from the initial position to the final position with minimum
time duration. The flow path decision problem is: given a rational number r > 0,
determine if the flow path optimization problem has time r.

In an extreme example, where b_r = 0 at each region r, the movement of
the robot is simply a sequence of translations provided by each region’s flow, and
the problem reduces to the prediction of the path of the robot in the presence
of overwhelming flow velocities where the robot has no control of movement. In
another extreme example, where |f_r| = 0 for each region r, this problem
reduces to the usual weighted shortest path problem through regions of given
translational velocities.

In our investigation of the computational complexity of this problem, we 
assume that the polygonal decomposition of input problem has n regions, and 
that the input problem is specified with a total of bits: in particular, for 

some constant c > 1, we assume that we are given the following within cn bits: 

— the positions of the boundaries of these regions, 

— the initial and final positions of the robot, 

— the flow velocity and bounding velocity of the robot within each region. 

This flow path problem has a number of macro-scale movement planning 
applications: 

— The problem of moving a ship on the surface of an ocean or river through
regions where the surface currents have known flow velocity.

— The problem of moving a submarine through regions where the underwater
currents have known flow velocity.

— The problem of moving an aircraft through regions where the air currents have
known flow velocity.

The flow path problem becomes particularly relevant to these practical problems
in the case where the object to be moved is under autonomous control, and where
the flow velocities are significant enough to require careful motion planning.
This is an increasingly common occurrence as new robotic devices are developed
that are of a rapidly decreasing size, and hence these meso- and micro-scale
robots can be strongly influenced by the local flows in their environment.
1.2 Previous Work and Our Results

Papadakis and Perakis PIMTT] previously gave heuristic algorithms for related 
problems such as for minimal time vessel routing in ocean currents. Sellen m 
studied the optimal route problem for a sailboat in a single region with multiple 
obstacles, where the velocity is a continuous function of sailing direction. There 
seems not to have been much other previous research on this problem, but there 
is considerable previous research on related movement problems. 

Reif m provided the first PSPACE hardness result for a robotic motion 
planning problem, and Schwartz and Sharir m gave motion planning algorithms 
using the theory of real closed fields (Canny 0). Reif and Sharir m gave 
algorithms and computational complexity lower bound results for robotic motion 
with moving obstacles (also see Wilfong fH])- 

Canny and Reif 0 showed the 3D minimal cost path problem with polygonal
obstacles is NP hard, and Reif and Storer m applied the theory of real
closed fields to give a decision algorithm for this problem. The reference |Sj
surveys work on the 2D and 3D minimal cost path problem, and approximation
algorithms for the weighted region minimal cost path problem include the
continuous Dijkstra method of Mitchell and Papadimitriou P], as well as a variety
of other discretization algorithms given by Mata and Mitchell [Zj, Lanthier et al.
p], Aleksandrov et al. More recent works include Reif and Sun [11 tif 1 ,5j and
Aleksandrov et al. m-

In Section 1 we have defined and motivated the flow path problem, and stated
our results. In Section 2 we provide some preliminary results on the geometry
of optimum paths for flow path problems, including a simple example where the
optimum path for the 2D version of the flow path problem consists of an
exponential number of distinct segments through flow regions. In that section, we
also provide a proof that the 3D version of the flow path problem is PSPACE
hard, which is the first known hardness result for the computational complexity
of this problem. In Section 3 we provide the first known decision algorithm for
the 2D flow path problem. This decision algorithm is of theoretical interest
only, but is proved by a rather interesting and unique inductive argument that
repeatedly makes use of root separation bounds derived from the theory of real
closed fields. In Section 4 we provide the first known approximation algorithm
for the 2D flow path problem, which is efficient for any given bounded error. In
Section 5 we conclude the paper with some open problems.

2 Preliminary and Lower Bound Results 

We first state some relevant properties of optimum paths for flow path problems 
within regions of translational flow: 

Proposition 1. The optimum path is a simple path: it does not self-intersect. 

Proposition 2. To move between any two points within the same flow region r,
the optimum path is a straight line path with a control velocity of fixed direction
and fixed maximum modulus b_r.
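
Proposition 2 makes the travel time inside one region easy to compute. With a constant control velocity v, covering a displacement d in time T requires T(v + f_r) = d, so |d/T - f_r| = |v| ≤ b_r. The least such T is the smaller root of |d - T f_r|^2 = (b_r T)^2, which can be written as T = |d|^2 / (d·f_r + sqrt((d·f_r)^2 + (b_r^2 - |f_r|^2)|d|^2)). The Python sketch below evaluates this expression. The function name `traversal_time` is an assumption for illustration, and the straight segment is assumed to stay inside region r.

```python
import math
from typing import Sequence


def traversal_time(d: Sequence[float], f: Sequence[float], b: float) -> float:
    """Least T >= 0 with |d - T*f| <= b*T, or math.inf if no such T exists.

    d is the displacement to cover, f the flow vector of the region and
    b the maximum norm of the control velocity in that region.
    """
    dd = sum(x * x for x in d)
    if dd == 0.0:
        return 0.0  # already at the target
    df = sum(x * y for x, y in zip(d, f))
    ff = sum(y * y for y in f)
    disc = df * df + (b * b - ff) * dd
    if disc < 0.0:
        return math.inf  # the flow is too strong to ever reach the target
    denom = df + math.sqrt(disc)
    return dd / denom if denom > 0.0 else math.inf
```

The two extreme cases mentioned in Section 1.1 fall out directly: with no flow the result is |d| / b_r, and with b_r = 0 the target is reachable only when d points along f_r. The rationalised form avoids a division by b_r^2 - |f_r|^2, which vanishes when the control speed equals the flow speed.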
Next, we provide some lower bound results for optimum paths. First we give
a simple construction of a family of flow problems where the optimum path
contains an exponential number of distinct straight-line segments.

Theorem 1. In the d = 2 dimensional version of the flow path problem with
O(1) regions and with flow rates specified by n bits, the optimum path can consist
of at least 2^{Ω(n)} distinct straight-line segments through flow regions.

Proof: We construct a flow path problem where there are four unit square
flow regions on the plane, each bordering the origin, and with unit magnitude
flows that force a point robot with control velocity magnitude 2^{-n} starting at
the origin to spiral an exponential number of times around the origin before
reaching the destination point (1,0).

In particular, for i = 0, 1, 2, 3, define angle θ_i = iπ/2 + π/4 and let the ith
flow region be a unit square centered at point (cos(θ_i)/√2, sin(θ_i)/√2), with
unit flow direction θ_i + π/2. Thus the concatenation of the flows of the four
squares runs in counterclockwise fashion around the origin. Starting at the
origin, the goal point (1,0) can only be reached by an exponential number of
cycles by the point robot around the origin, where on each cycle the point can
get an additional distance c2^{-n} further from the origin, for a fixed constant
c > 0. □
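
The four-region construction can be written down concretely. The Python sketch below builds the square centres and flow vectors and counts the cycles needed to gain unit distance. It rests on two readings of the proof that are assumptions here: the centres are scaled by 1/√2 so that each unit square has a corner at the origin, and the flow direction is θ_i + π/2 so that the flows circulate counterclockwise. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def spiral_regions() -> List[Tuple[Vec, Vec]]:
    """Return (centre, unit flow vector) for the four unit-square regions."""
    regions = []
    for i in range(4):
        theta = i * math.pi / 2 + math.pi / 4
        # Scaling by 1/sqrt(2) puts the centres at (+-1/2, +-1/2).
        centre = (math.cos(theta) / math.sqrt(2), math.sin(theta) / math.sqrt(2))
        # The flow is perpendicular to the centre direction, turning left.
        flow = (math.cos(theta + math.pi / 2), math.sin(theta + math.pi / 2))
        regions.append((centre, flow))
    return regions


def min_cycles(n: int, c: float = 1.0) -> int:
    """Cycles needed to move distance 1 outward at c * 2**-n per cycle."""
    return math.ceil(2 ** n / c)
```

For instance, `min_cycles(10)` is 1024, and each cycle crosses all four regions, which is where the exponential segment count comes from. The constant c is left as a parameter because the proof only asserts that some fixed c > 0 exists.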

Next we prove that the d = 3 dimensional version of this problem is PSPACE
hard. We use a construction given by Canny and Reif 0 in their proof that the
problem of finding a minimal path in 3D between two points and avoiding a
set of polygonal obstacles is NP hard. Given a Boolean formula with a list of
N variables X, they construct a polynomial number of obstacles of polynomial
size description, with a start point, an end point and an intended path length L
such that the formula is satisfiable iff there is a path of distance ≤ L between
the start and end points avoiding these obstacles. The construction represents
each possible variable assignment A of X by a (binary) encoding as a point
p(A) on a particular segment of an obstacle; each point p(A) of the segment is
reachable from the start point by a path of distance at most L iff A is a
satisfying assignment.

We use their construction as follows: 

We begin with a linear space Turing Machine M with given binary input of
length n. We can assume without loss of generality that M accepts the input
only on computations of 2^{cn} steps, for some constant c > 1. Let X, X' each be
lists of N Boolean variables, where N = cn + O(1). We define a Boolean formula
NEXT(X, X') which is true iff X, X' each encode valid configurations of M
and the configuration associated with X' is reachable from the configuration
associated with X by one step of the machine M. Then we apply the Canny
and Reif ^ construction to form a set of obstacles in 3D of polynomial size
with distinguished obstacle segments e, e' associated with variable lists X, X'
respectively. Again, for any Boolean assignment A of X there is a point p(A) of
e encoding A, and for any Boolean assignment A' of X' there is a point p'(A') of
e' encoding A'. The Canny and Reif ^ construction ensures that point p'(A') of
the segment e' is reachable from the point p(A) of the segment e in distance at
most L' iff NEXT(X, X') holds for these Boolean variable assignments A, A' of
Boolean variable lists X, X' respectively. We define the modulus of the control
velocity of the point robot to be unit within the free space contained within the
set of obstacles, and the modulus of the control velocity of the point robot
to be 0 within the obstacles.

We now provide for a translational “ribbon” of flow providing time paths 
of time duration < for a sufficiently large constant c' > 1 from points 

of e' back again to corresponding points of e. This ‘ribbon” has a cross-section 
that is a rectangle of unit length and < width. The purpose of this 

“ribbon” is to allow each of the single step transitions of the Turing machine to be 
repeated without changing the encoding of configurations. In particular, choosing 
a sufficiently large constant c" > 1, we define an additional polynomial sequence 
of connected rectangular flow strips (comprising the “ribbon” and external to 
the previously defined obstacles) each with a translational rate of flow 2'^ " along 
the “ribbon” . Then the time of the robot to move via this “ribbon” from a point 
p'{A') of e' to a p{A') of e is at most 2““'". 

Furthermore, we define the start point of the robot to be a point $p(A_0)$ of e
such that $A_0$ encodes the initial configuration of machine M, and we define the
destination point of the robot to be a point $p(A_f)$ of e' such that $A_f$ encodes the
accepting configuration of machine M. Then it follows by this construction that
M has an accepting computation of 2 “” steps iff the flow path problem has an 
optimum path of time duration ^'(1-1-2““ ") for the point robot to move between 
the start and destination points. This construction can easily be shown to require 
O(log n) deterministic space to determine the specification of the resulting flow
path problem. Hence we have shown: 

Theorem 2. The flow path problem in three dimensions is PSPACE hard with 
respect to log-space reductions. 



3 A Decision Algorithm for the 2D Flow Path Problem

Here we develop a decision algorithm for the two dimensional flow path problem. 
We can assume, without loss of generality, that all flow regions are triangles.

We first state the following two lemmas (refer to Figure [Hb) for two dimen- 
sional flow path problems. We will apply them to develop a decision algorithm. 

Lemma 1. Let $r = \triangle ABC$ be a region with flow $f_r$. Let p be a point on AB
with distance d to A and p' be a point on AC with distance d' to A. Let $\alpha$ be
the angle between AB and $f_r$ and let $\theta$ be the angle between pp' and AB. Let $\rho_r$
be the ratio of $b_r$ and $|f_r|$. Then the optimum path from p to p' can be achieved
by the point robot adopting a velocity with maximum magnitude and an angle of
$\arcsin\left(\frac{\sin(\theta - \alpha)}{\rho_r}\right)$ from pp'.



Lemma 2. Let $\beta$ be the angle between AB and AC. Then the optimum path
from p to p' inside r has a cost of

$$t = \frac{l^2}{b_r\left(\sqrt{l^2 - T_1^2} + T_2\right)},$$

where $l = |pp'| = \sqrt{d^2 + d'^2 - 2dd'\cos\beta}$, $T_1 = (d\sin\alpha - d'\sin(\alpha + \beta))/\rho_r$ and
$T_2 = (d\cos\alpha - d'\cos(\alpha + \beta))/\rho_r$.
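
As a rough numerical companion to Lemmas 1 and 2, the crossing time of a straight segment inside one region can be computed directly from the condition $|D - t f| = t b$, where $D = p' - p$, $f$ is the region's flow vector and $b$ its velocity bound. The function name `travel_time` and the Cartesian vector inputs are assumptions of this sketch and do not come from the paper.

```python
import math

def travel_time(p, q, flow, speed):
    """Least time to move in a straight line from p to q inside one region.

    flow  -- constant flow vector (fx, fy) of the region (assumed input form)
    speed -- maximum control speed of the robot in the region
    The ground velocity is flow + control with |control| <= speed, so the
    crossing time t is the smallest positive root of |D - t*flow| = t*speed.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    l2 = dx * dx + dy * dy
    if l2 == 0.0:
        return 0.0
    dot = dx * flow[0] + dy * flow[1]      # flow component along the segment
    cross = dx * flow[1] - dy * flow[0]    # flow component across the segment
    disc = speed * speed * l2 - cross * cross
    if disc < 0.0:
        return math.inf                    # cross flow cannot be cancelled
    denom = math.sqrt(disc) + dot
    if denom <= 0.0:
        return math.inf                    # the robot is swept backwards
    return l2 / denom
```

With no flow the result reduces to Euclidean length divided by speed; a tail flow shortens the time and a head flow lengthens it.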
The next lemma follows directly from Lemma 2.

Lemma 3. Let $S_r(d, d')$ be an optimum path segment for the robot within a
convex flow region r that begins at a distance d along a boundary edge $e_1$ of r,
and ends at a distance d' along a boundary edge $e_2$ of r. Then there is a formula
$F(t, x, x')$ of the existential theory of real closed fields (with free variables t, x, x'
and involving only a constant number of other variables; furthermore with a
constant number of terms and with constant rational coefficients of at most O(n)
size), such that $F(\tau, d, d')$ is true iff $S_r(d, d')$ has time duration $\tau$.

The existential theory of real closed fields is the logical system consisting of exis- 
tentially quantified formulas, whose variables range over the real numbers, and 
whose formulas are constructed of inequalities of rational forms (these rational 
forms are arithmetic expressions involving these real variables and fixed rational 
constants which may be added and multiplied together) and the usual Boolean 
logical connectives AND, OR, NOT. Collins gave a decision procedure for the
existential theory of real closed fields that was improved by Canny to run in 
polynomial space: 

Lemma 4. Given a formula of the existential theory of real closed fields of length 
n, the formula can be decided in space and time, and the existentially 

quantified variables can be determined, up to exponential bit precision, within this 
computational complexity. 

Collins proved a useful lemma as a byproduct of his decision procedure:

Lemma 5. If the solution of a formula of length n in the existential theory of 
real closed fields is not the zero vector 0 , then it is of modulus at least 2~^ , for 
some constant c > 0. 

Now consider an optimum path S(d, d') for the robot that begins at a distance
d along a boundary edge e of a flow region r, and ends at a distance d' along the
same boundary edge e of r, and does not visit any points of e between these, but
may pass through f (where f counts the repetitions) other flow regions. Then
S(d, d') consists of at most O(f) straight-line segments through flow regions. By
Lemma 3 we have:

Lemma 6. There is a formula $F'(t, x, x')$ of the existential theory of real closed
fields (with free variables t, x, x' and involving only O(f) other variables, and
with a quadratic number of terms, and with constant rational coefficients of at most
O(n) size), such that $F'(\tau, d, d')$ is true iff S(d, d') has time duration $\tau$.

We now define a hierarchy of paths of increasing complexity (this term should not 
be confused with the usual notion of computational complexity). Let a path have 
complexity 0 if it visits no flow region boundary edge more than once, except 
possibly at the start and final points of the path. Let a path p have complexity 
k if it does not have complexity < k and $p = q_0 p_1 q_1 \ldots p_h q_h$ where each $p_i$
is a path that passes through only one flow region, and each $q_i$ is a path of
complexity < k that begins and ends at the same flow region boundary edge.
The following can be proved by induction on the number of distinct flow edges: 
Proposition 3. The maximum complexity number of an optimum path in any
2D flow path problem with n triangular flow regions is at most $n^{O(1)}$.

We will derive, for any optimum path of complexity 0, a finite bound on the 
number of straight-line segments through flow regions. Consider an optimum 
path S{d, d!) of complexity 0 that begins and ends at a distance d, d' respectively 
along a flow region boundary edge e of a flow region r. Applying Lemma 0 to 
Lemmas 0and0 we have: either d = d' or \d' — d\ > 2~^ , for some constant 
c > 0. This implies: 

Lemma 7. Any optimum path of complexity 0 between two points consists of at 
most 2^ ^ * straight-line segments through flow regions. 

In the following, let Eo(n) = n and for k > 0, let Ek{n) = 

Now suppose as an inductive assumption that, for some k > 0, any
optimum path of complexity k' < k between two points consists of at most
$E_{2k'}(O(n))$ straight-line segments through flow regions. Applying Lemma 6 we
can construct an existentially quantified formula of the theory of real closed
fields with $E_{2k'}(O(n))$ variables, which is true iff there is an optimum path of
complexity k' and of time duration $\tau$ between the initial and final points.

Consider an optimum path S(d, d') of complexity k that begins and ends
at a distance d, d' respectively along a flow region boundary edge e of a flow
region r. Recall that since S(d, d') has complexity k, it can be decomposed as
$q_0 p_1 q_1 \ldots p_h q_h$ where each $p_i$ is a path that passes through only one flow region,
and each $q_i$ is a path of complexity < k that begins and ends at the same flow
region boundary edge. Applying again Lemma 0 to Lemmas 0 and 0 we now 
have: either d = d' or \d' — d| > 2“^ ^ 2 (fc-i)( < )» ^ \lE2k{0{n)). Hence we 
have that: 

Lemma 8. Any optimum path of complexity k between two points consists of at
most $E_{2k}(O(n))$ straight-line segments through flow regions.

Then applying the Canny 0 decision procedure (Lemma 0 , we have that 
there is an algorithm that decides, within E 2 k{ 0 {n)) space and 2‘^(^^'“^‘^^"))Hime, 
the two dimensional flow path problem for optimum paths of complexity k. By 
Proposition 3 (which bounds the complexity number of an optimum path to be
$n^{O(1)}$), we have one of our main results:

Theorem 3. There is an algorithm, with E2k{0{n)) space and 

time cost, for the two dimensional flow path decision problem of size n, where 

k < is the minimum complexity of an optimum path. 

4 An Efficient ε-Short Approximation Algorithm

4.1 Approximation Algorithm Based on Discretization 

The significance of the above algorithm is that the problem is decidable, but its 
complexity is far too high for practical implementation. A natural strategy to 
approximately solve the 2D flow problem is to discretize the polyhedral decom-
position of the 2D space by inserting Steiner points on edges. A directed discrete
graph G is then constructed by interconnecting each pair of Steiner points or ver-
tices on the boundary of the same region. Each edge $(p_1, p_2)$ connecting Steiner
point (or vertex) $p_1$ to $p_2$ is assigned a weight that is equal to $\tau(p_1, p_2)$.

Dijkstra’s algorithm can be used to find the shortest path from the given 
source point s to the destination point t in G. This shortest path found in G 
can be used to approximate the “true” optimum path in the original continuous 
space. This approach has been used in approximating the geometric shortest
path in 2D or 3D space, or the optimum path in weighted regions in 2D
space.
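
The construction just described can be sketched as follows: place evenly spaced Steiner points on every triangle edge, connect each ordered pair of boundary points of the same region with its straight-line travel time, and run Dijkstra's algorithm. The region encoding `(triangle, flow, speed)`, the helper names, and the coordinate rounding used to merge points on shared edges are all assumptions of this illustration, not details taken from the paper.

```python
import heapq
import math
from itertools import permutations

def travel_time(p, q, flow, speed):
    """Straight-line crossing time under a constant flow (inf if impossible)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    l2 = dx * dx + dy * dy
    if l2 == 0.0:
        return 0.0
    dot = dx * flow[0] + dy * flow[1]
    cross = dx * flow[1] - dy * flow[0]
    disc = speed * speed * l2 - cross * cross
    if disc < 0.0:
        return math.inf
    denom = math.sqrt(disc) + dot
    return l2 / denom if denom > 0.0 else math.inf

def boundary_points(tri, m):
    """Vertices of a triangle plus m evenly spaced Steiner points per edge."""
    pts = set()
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        for j in range(m + 1):
            s = j / (m + 1)
            # Rounding lets points computed from either side of a shared edge coincide.
            pts.add((round(a[0] + s * (b[0] - a[0]), 9),
                     round(a[1] + s * (b[1] - a[1]), 9)))
    return pts

def build_graph(regions, m):
    """Directed graph: one weighted edge per ordered pair of points on a region boundary."""
    graph = {}
    for tri, flow, speed in regions:
        for p, q in permutations(boundary_points(tri, m), 2):
            w = travel_time(p, q, flow, speed)
            if w < graph.setdefault(p, {}).get(q, math.inf):
                graph[p][q] = w
    return graph

def dijkstra(graph, source, target):
    """Cost of the cheapest discrete path from source to target."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph.get(u, {}).items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf
```

For a unit square split into two triangles, a call such as `dijkstra(build_graph(regions, 3), (0, 0), (1, 1))` returns the cost of the best path restricted to the inserted points.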

The basic discretization scheme is to place m Steiner points uniformly on
each edge, for some positive integer m. In G there will be $O(nm)$ points
and $O(nm^2)$ edges and thus the time complexity of Dijkstra’s algorithm is
$O(nm^2 + nm\log(nm))$. Using the discretization algorithm BUSHWHACK of Sun and Reif
[15], the optimum path in G can be computed in $O(nm\log(nm))$
time without visiting all edges. This scheme is easy to implement, al-
though no error bound can be guaranteed for the approximate optimum path.



4.2 Non-uniform Discretization 

The above approximation algorithm can be improved by adopting a non-uniform 
discretization. This discretization scheme is similar to the one introduced by
Aleksandrov et al. for the weighted region optimum path problem. Instead
of placing Steiner points with even spacing on each edge, we place Steiner points 
with higher density in the portions of the edge that are closer to the two end 
points. The discretization scheme assumes that 1 for any region r. 

Before we describe the new discretization scheme, we first introduce some
notations that will be used. These notations are, collectively, $V_p$, $\tau_{\min}(p)$,
$\tau_{\max}(e)$, $p_e$, $b_e$ and $R_v$.

Let p be an arbitrary point on the boundary of some region r. We define
the vicinity of p, $V_p$, to be the union of all boundary edges each of which is
not incident to p yet is on the same region boundary as p. We define
$\tau_{\min}(p) = \inf\{\min\{\tau(p, v), \tau(v, p)\} \mid v \in V_p\}$. In the full version of the paper we show
that $\tau_{\min}(p)$ can be computed in constant time for each p. Observe that $\tau_{\min}(p)$
represents the minimum cost of traveling in a straight line path between p and
any point on the boundary of the union of all regions incident to p.

For each edge e, we let $\tau_{\max}(e) = \sup\{\tau_{\min}(p) \mid p \in e\}$. $\tau_{\max}(e)$ is a constant
that is decided by the geometric properties as well as $f_r$ and $b_r$ of each region
r incident to e. We let $p_e$ denote a point on e such that $\tau_{\min}(p_e) = \tau_{\max}(e)$.
$\tau_{\max}(e)$ and $p_e$ can be determined in constant time.

Let $r_1$ and $r_2$ be the two regions incident to e, and let $\alpha_1$, $\alpha_2$ be
the angles between $\overrightarrow{v_1 v_2}$ and $f_{r_1}$, $f_{r_2}$ respectively. We define

$$b_{v_1 \to v_2} = \max\left\{\sqrt{b_{r_1}^2 - \sin^2\alpha_1 \cdot |f_{r_1}|^2} + |f_{r_1}| \cdot \cos\alpha_1,\ \sqrt{b_{r_2}^2 - \sin^2\alpha_2 \cdot |f_{r_2}|^2} + |f_{r_2}| \cdot \cos\alpha_2\right\}.$$

$b_{v_1 \to v_2}$ is the “effective velocity” of a robot traveling on edge e from $v_1$ to $v_2$, as
it takes time $d / b_{v_1 \to v_2}$ to travel distance d on e following Lemma 2. Similarly, we
can define $b_{v_2 \to v_1}$ and let $b_e = \min\{b_{v_1 \to v_2}, b_{v_2 \to v_1}\}$. Note that $b_e$ is the minimum
effective velocity of the robot traveling on edge e in any of the two directions.

For any vertex v and any triangular region r incident to v, we define 
dmin(v,r) to be the minimum Euclidean distance from v to the edge of r 
that is not incident to v. We define the radius of v to be the following: 
Ry — min{ | r is incident to u}. Observe that while Tmin(v) is the mini- 
mum cost of one-segment path between v and any point on the boundary of the 
union of regions incident to v, Ry provides a lower bound on the minimum cost 
of any path between v and any point on the boundary of the union. 

For any edge $e = v_1 v_2$, $p_e$ divides edge e into two segments, $v_1 p_e$ and $v_2 p_e$.
In the following we describe the placement of Steiner points $p_{i,1}, p_{i,2}, \ldots, p_{i,k}$ on
each segment $v_i p_e$ for i = 1, 2. The first Steiner point $p_{i,1}$ is placed on segment
$v_i p_e$ with distance $\varepsilon b_e R_{v_i}$ to vertex $v_i$. The subsequent Steiner point $p_{i,j}$ is placed
between $p_{i,j-1}$ and $p_e$ with distance $\varepsilon b_e \tau_{\min}(p_{i,j-1})$ to $p_{i,j-1}$. We continue adding
Steiner points until no more Steiner point can be added on $v_i p_e$. That is, if $p_{i,j}$
is the last Steiner point added, no more Steiner point is inserted on segment $v_i p_e$
if $|v_i p_{i,j}| + \varepsilon b_e \tau_{\min}(p_{i,j}) \ge |v_i p_e|$. Finally, we add $p_e$ as a Steiner point on e.
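
The placement rule can be illustrated on a single segment $v_i p_e$ by tracking each Steiner point through its distance from $v_i$. In this sketch the function name, the parameter names, and the convention that `tau_min` is a callable taking that distance are assumptions made for illustration; the paper computes $\tau_{\min}$ geometrically.

```python
def place_steiner_points(seg_len, eps, b_e, radius_v, tau_min):
    """Distances from vertex v_i of the Steiner points on segment v_i p_e.

    seg_len  -- length of the segment v_i p_e
    eps      -- approximation parameter epsilon
    b_e      -- minimum effective velocity on the edge
    radius_v -- the radius R_{v_i} of the vertex
    tau_min  -- callable: distance from v_i -> tau_min at that point (assumed interface)
    """
    points = []
    x = eps * b_e * radius_v                 # first Steiner point
    while x < seg_len:
        points.append(x)
        step = eps * b_e * tau_min(x)        # gap to the next Steiner point
        if step <= 0.0:
            raise ValueError("tau_min must be positive on the segment")
        x += step                            # stop once the next point would pass p_e
    points.append(seg_len)                   # p_e itself is always a Steiner point
    return points
```

If `tau_min` grows roughly linearly with the distance from the vertex, consecutive points form a geometric progression, which is why the spacing is densest near the end points of the edge.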

To give an upper bound for the number of Steiner points on each edge e, we 
introduce three parameters Pmin and A. We let dmin be the smallest angle 
of any triangular region and let Pmin be the minimum by among all regions. 
We define A to be $\max\{b_r / b_{r'} \mid$ regions r and r' are adjacent$\}$. A indicates how
drastic the velocity bound of the robot can change from one region to another. In 
practice, since the triangular decomposition is usually a result of discretization, 
A is not very large in many cases. Further, we let Cskew = sineJ.CTlPm l-i) 
call it the “skewness” parameter of the space. We have the following theorem 
(we include the proof in the full version of the paper): 



Theorem 4. For the non-uniform discretization scheme, the total number of 
Steiner points added into the triangular decomposition is O ( log 
where n is the number of triangular regions in the decomposition. 



The dependence of $C_{skew}$ on A implies that the variations of the velocity
bound of the robot in different regions will affect the complexity of this approx-
imation algorithm. If the velocity bound is uniform in all regions, fewer Steiner
points are needed to guarantee an ε-approximation. Conversely, if the veloc-
ity bound changes drastically in adjacent regions, it will take many more Steiner
points to achieve the same approximation bound.



4.3 Discrete Path 

The discrete graph G constructed from the discretization described 
above has log points and log 

edges. The time complexity of computing the optimum path in G is 
Qj^ Cske,y-n ^ Csken, ^Qg Cskew _|_ log Ti) log ) by Dijkstra’s algorithm, or 

0( ^‘’'"‘"'” (log -l-logn) log ) by the BUSHWHACK algorithm. In this 
subsection we analyze how well a discrete path can approximate an optimum
path in the original continuous space.

Fig. 2. (a) Adding Steiner points on an edge. (b) A discrete path that neighbors a linear path.

As we have mentioned earlier, an optimum path $p_{opt}$ from a given source point
s to a given destination point t in the original continuous space is piecewise linear. 
It only bends on the boundary of regions. We establish the following theorem 
that provides a bound on the error of using a discrete path to approximate an 
optimum path. 

Theorem 5. For any such linear path p, there exists a discrete path p' such that
$\|p'\| \le (1 + 9\varepsilon)\|p\|$.

Here $\|p\|$ denotes the cost of path p. To prove this theorem, we first describe
how to construct a discrete path p' that “neighbors” a given linear path p, and
then we show that the cost of such a path p' is no more than $(1 + 9\varepsilon)\|p\|$ by
using several lemmas.

Let $s = a_1, a_2, \ldots, a_{k-1}, a_k = t$ be the points where the linear path p bends,
as shown in Figure 2b. We need to construct a discrete path with bending
points $s = b_1, b_2, \ldots, b_{k-1}, b_k = t$ such that each $b_i$ is either a Steiner point or a
vertex. As path p only bends on the boundary of regions, each bending point $a_i$
is either between two Steiner points on the same edge, or between a vertex and
its neighboring Steiner point.

Suppose that $a_i$ is between two Steiner points $p_j, p_{j+1}$ on edge $e = v_{i_1} v_{i_2}$.
Without loss of generality, suppose that $p_j$ is between $p_{j+1}$ and $v_{i_1}$. Then we
choose $b_i$ to be $p_j$ if $p_j$ is between $v_{i_1}$ and $p_e$; otherwise, we choose $b_i$ to be
$p_{j+1}$. If $a_i$ is between vertex $v_{i_1}$ and Steiner point $p_1$ on edge $e = v_{i_1} v_{i_2}$, we
choose $b_i$ to be $v_{i_1}$.
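
The choice of $b_i$ can be sketched on one edge by identifying every point with its distance from $v_{i_1}$. The function name, the encoding of positions as numbers, and the symmetric handling of the half of the edge beyond $p_e$ (the "without loss of generality" case) are assumptions of this illustration.

```python
import bisect

def snap_bend_point(a, steiner, p_e, edge_len):
    """Pick the discrete bend point for a bend at position a on an edge.

    a        -- position of the bend along the edge, measured from v_{i_1}
    steiner  -- sorted positions of the Steiner points on the edge (p_e included)
    p_e      -- position of the point p_e
    edge_len -- position of the other vertex v_{i_2}
    """
    j = bisect.bisect_left(steiner, a)
    if j < len(steiner) and steiner[j] == a:
        return a                        # the bend already sits on a Steiner point
    if j == 0:
        return 0.0                      # between v_{i_1} and its first Steiner point
    if j == len(steiner):
        return edge_len                 # between the last Steiner point and v_{i_2}
    lower, upper = steiner[j - 1], steiner[j]
    # Move away from p_e, towards the vertex on the bend's own side of p_e.
    return lower if lower < p_e else upper
```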

To evaluate $\|p'\|$, we need to estimate the cost of each segment $b_j b_{j+1}$. We can
compare $\|b_j b_{j+1}\|$ against $\|a_j a_{j+1}\|$ in various situations by using this lemma:

Lemma 9. Assuming e < if both bj and are Steiner points, then 

||6j6j_|_i|| < (1 + 3e)||aJOj+i|| • Also, if one of bj and bj+i is a vertex v of 






the triangular decomposition, then ||6j6j+i|| < (1 + |e)||6j6j+i|| + eRv Fur- 
ther, if both bj and bj^i are vertices of the triangular decomposition, then 
||6j6j_|_i|| < llajOj+ill + eR„' + cR^" , where v' = bj and v" = &j+i- 

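With this lemma, we are ready to prove Theorem 5.

Proof of Theorem 5. Among the bending points $b_1, b_2, \ldots, b_{k-1}, b_k$ of p', let
$b_{i_1}, b_{i_2}, \ldots, b_{i_d}$ be vertices of the triangular decomposition, where
$1 = i_1 < i_2 < \cdots < i_{d-1} < i_d = k$. Let $b_{i_j} = v_j$ for $1 \le j \le d$. We have

$$\|p'\| = \sum_{j=1}^{k-1} \|b_j b_{j+1}\| \le (1 + 3\varepsilon) \sum_{j=1}^{k-1} \|a_j a_{j+1}\| + \varepsilon \Big(R_{v_1} + 2\sum_{j=2}^{d-1} R_{v_j} + R_{v_d}\Big) = (1 + 3\varepsilon)\|p\| + \varepsilon \Big(R_{v_1} + 2\sum_{j=2}^{d-1} R_{v_j} + R_{v_d}\Big).$$

For each j, $1 \le j < d$, let $C_j = \sum_{i=i_j}^{i_{j+1}-1} a_i a_{i+1}$. That is, $C_j$
is the sub-path of p between $a_{i_j}$ and $a_{i_{j+1}}$. Let r be the region incident to $b_{i_j}$,
$a_{i_j}$ and $a_{i_j+1}$, and let r' be the region incident to $b_{i_{j+1}}$, $a_{i_{j+1}-1}$ and $a_{i_{j+1}}$. As
$b_{i_j} a_{i_j} + C_j + a_{i_{j+1}} b_{i_{j+1}}$ is a path from vertex $v_j$ to $v_{j+1}$, we have
$\|b_{i_j} a_{i_j}\|_r + \|C_j\| + \|a_{i_{j+1}} b_{i_{j+1}}\|_{r'} \ge R_{v_j}$ as well as
$\|b_{i_j} a_{i_j}\|_r + \|C_j\| + \|a_{i_{j+1}} b_{i_{j+1}}\|_{r'} \ge R_{v_{j+1}}$. Here
$\|p\|_r$ is the cost of path p measured by the cost function of r. As $\|b_{i_j} a_{i_j}\|_r \le
\varepsilon R_{v_j}$ and $\|a_{i_{j+1}} b_{i_{j+1}}\|_{r'} \le \varepsilon R_{v_{j+1}}$, we can get $\|C_j\| \ge (1 - 2\varepsilon)(R_{v_j} + R_{v_{j+1}})/2$
and therefore

$$R_{v_1} + 2\sum_{j=2}^{d-1} R_{v_j} + R_{v_d} = \sum_{j=1}^{d-1} (R_{v_j} + R_{v_{j+1}}) \le \sum_{j=1}^{d-1} \frac{2}{1 - 2\varepsilon}\|C_j\| = \frac{2}{1 - 2\varepsilon}\|p\| \le 6\|p\|.$$

Hence $\|p'\| \le (1 + 3\varepsilon)\|p\| + 6\varepsilon\|p\| = (1 + 9\varepsilon)\|p\|$. □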

Combining Theorems 4 and 5, we have the second main result of this paper:

Theorem 6. An ε-short approximation of the optimum path in regions with
flows can be computed in {log -f log n) log ) time, where n 

is the number of regions. 

5 Conclusion 

While we have provided the first decision procedure for the two dimensional flow
path problem, the complexity of our algorithm appears far too high to have
practical use. There is no known lower bound for this two dimensional version
of the flow path problem. Furthermore, there is no known decision algorithm for the
three dimensional version of the flow path problem. It remains an open problem
to determine a more exact complexity bound for the two and three dimensional
flow problems.
Abstract. We show that, for any n-vertex graph G and integer pa-
rameter k, there are at most $3^{4k-n}4^{n-3k}$ maximal independent sets
$I \subset G$ with $|I| \le k$, and that all such sets can be listed in time
$O(3^{4k-n}4^{n-3k})$. These bounds are tight when $n/4 \le k \le n/3$. As a
consequence, we show how to compute the exact chromatic number of
a graph in time $O((4/3 + 3^{4/3}/4)^n) \approx 2.4150^n$, improving a previous
$O((1 + 3^{1/3})^n) \approx 2.4422^n$ algorithm of Lawler (1976).

1 Introduction 

One of the earliest works in the area of worst-case analysis of NP-hard problems 
is a 1976 paper by Lawler [5] on graph coloring. It contains two results: an 
algorithm for finding a 3-coloring of a graph (if the graph is 3-chromatic) in time
$O(3^{n/3}) \approx 1.4422^n$, and an algorithm for finding the chromatic number of an
arbitrary graph in time $O((1 + 3^{1/3})^n) \approx 2.4422^n$. Since then, the area has grown,
and there has been a sequence of papers improving Lawler’s 3-coloring algorithm
[1, 2, 4, 7], with the most recent algorithm taking time $\approx 1.3289^n$. However, there
has been no improvement to Lawler’s chromatic number algorithm.

Lawler’s algorithm follows a simple dynamic programming approach, in which
we compute the chromatic number not just of G but of all its induced subgraphs.
For each subgraph S, the chromatic number is found by listing all maximal inde-
pendent subsets $I \subset S$, adding one to the chromatic number of $S \setminus I$, and taking
the minimum of these values. The $O((1 + 3^{1/3})^n)$ running time of this technique
follows from an upper bound of $3^{n/3}$ on the number of maximal independent
sets in any n-vertex graph, due to Moon and Moser [6]. This bound is tight in
graphs formed by a disjoint union of triangles.
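
The dynamic program described in this paragraph can be sketched with vertex subsets encoded as bitmasks. The function names and the plain take-or-skip enumeration of maximal independent subsets are assumptions of this illustration; they are chosen for brevity and do not attain the running time analyzed by Lawler.

```python
def maximal_independent_subsets(adj, S):
    """Yield every maximal independent subset of S (a bitmask).

    adj[v] is the bitmask of the neighbors of v, with v itself excluded.
    """
    def extend(candidates, chosen):
        if candidates == 0:
            # chosen is independent; it is maximal if every other vertex of S
            # has a neighbor in chosen.
            if all(adj[v] & chosen for v in range(len(adj))
                   if (S >> v) & 1 and not (chosen >> v) & 1):
                yield chosen
            return
        v = (candidates & -candidates).bit_length() - 1    # lowest candidate
        yield from extend(candidates & ~(adj[v] | (1 << v)), chosen | (1 << v))
        yield from extend(candidates & ~(1 << v), chosen)
    yield from extend(S, 0)

def lawler_chromatic_number(adj):
    """chi[S] = 1 + min over maximal independent subsets I of S of chi[S minus I]."""
    n = len(adj)
    chi = [0] * (1 << n)
    for S in range(1, 1 << n):
        chi[S] = 1 + min(chi[S & ~I] for I in maximal_independent_subsets(adj, S))
    return chi[(1 << n) - 1]
```

Because every maximal independent subset of a nonempty S is nonempty, each lookup `chi[S & ~I]` refers to a numerically smaller index that has already been filled in.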

In this paper, we provide the first improvement to Lawler’s algorithm, using 
the following ideas. First, instead of removing a maximal independent set from 
each induced subgraph S, and computing the chromatic number of S from that
of the resulting subset, we add a maximal independent set of $G \setminus S$ and compute
the chromatic number of the resulting superset from that of S. This reversal does
not itself affect the running time of the dynamic programming algorithm, but 
it allows us to constrain the size of the maximal independent sets we consider 
to at most $|S|/3$. We show that, with such a constraint, we can improve the
Moon-Moser bound: for any n-vertex graph G and integer parameter k, there
are at most $3^{4k-n}4^{n-3k}$ maximal independent sets $I \subset G$ with $|I| \le k$. This
bound then leads to a corresponding improvement in the running time of our
chromatic number algorithm.
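
A small exact computation may help in comparing the size-constrained bound with the Moon-Moser bound. The helper names below are assumptions of this sketch; rational arithmetic is used only so that negative exponents stay exact.

```python
from fractions import Fraction

def small_mis_bound(n, k):
    """The bound 3^(4k-n) * 4^(n-3k) on maximal independent sets of size at most k."""
    return Fraction(3) ** (4 * k - n) * Fraction(4) ** (n - 3 * k)

def moon_moser_bound(n):
    """The bound 3^(n/3), evaluated here only for n divisible by three."""
    return 3 ** (n // 3)

# Growth rates of the two chromatic number algorithms.
LAWLER_BASE = 1 + 3 ** (1 / 3)
IMPROVED_BASE = 4 / 3 + 3 ** (4 / 3) / 4
```

At k = n/3 the two bounds coincide, and for smaller k the constrained bound is strictly smaller, which is where the improvement comes from.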

2 Preliminaries 

We assume as given a graph G with vertex set V(G) and edge set E(G). We
let $n = |V(G)|$ and $m = |E(G)|$. A proper coloring of G is an assignment of
colors to vertices such that no two endpoints of any edge share the same color.
We denote the chromatic number of G (the minimum number of colors in any
proper coloring) by $\chi(G)$.

If $V(G) = \{v_0, v_1, \ldots, v_{n-1}\}$, then we can place subsets $S \subset V(G)$ in one-to-
one correspondence with the integers $0, 1, \ldots, 2^n - 1$:

$$S \leftrightarrow \sum_{v_i \in S} 2^i.$$

Subsets of vertices also correspond to induced subgraphs of G, in which we
include all edges between vertices in the subset. We make no distinction between
these three equivalent views of a vertex subset, so e.g. we will write $\chi(S)$ to
indicate the chromatic number of the subgraph induced by set S, and X[S] to
indicate a reference to an array element indexed by the number $\sum_{v_i \in S} 2^i$. We
write S < T to indicate the usual arithmetic comparison between two numbers,
and $S \subset T$ to indicate the usual (proper) subset relation between two sets. Note
that, if $S \subset T$, then also S < T, although the reverse implication does not hold.
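
The correspondence between vertex subsets and integers can be made concrete in a few lines. The function names are assumptions of this sketch, and vertices are identified with their indices.

```python
def subset_to_index(subset):
    """Map a set of vertex indices {i : v_i in S} to the integer sum of 2^i."""
    return sum(1 << i for i in subset)

def index_to_subset(index):
    """Inverse map: the set of bit positions that are set in index."""
    return {i for i in range(index.bit_length()) if (index >> i) & 1}

def is_proper_subset(s, t):
    """Proper subset test on two bitmask-encoded sets."""
    return s != t and (s & ~t) == 0
```

For example, {v_0, v_2} maps to 5; the set {v_1} maps to 2, which is numerically smaller than 5 without being a subset, illustrating that the reverse implication fails.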

A set S is a maximal k-chromatic subset of T if $S \subset T$, $\chi(S) = k$, and
$\chi(S') > k$ for every $S \subset S' \subset T$. In particular, if k = 1, S is a maximal
independent subset of T.

For any vertex $v \in V(G)$, we let N(v) denote the set of neighbors of v,
including v itself. If S and T are sets, $S \setminus T$ denotes the set-theoretic difference,
consisting of elements of S that are not also in T. $K_i$ denotes the complete graph
on i vertices. We write deg(v, S) to denote the degree of vertex v in the subgraph
induced by S.

We express our pseudocode in a syntax similar to that of C, C++, or Java.
In particular this implies that array indexing is zero-based. We assume the usual 
RAM model of computation, in which a single cell is capable of storing an integer 
large enough to index the memory requirements of the program (thus, in our case, 
n-bit values are machine integers), and in which arithmetic and array indexing 
operations on these values are assumed to take constant time. 
3 Small Maximal Independent Sets

Theorem 1. Let G be an n-vertex graph, and k be a nonnegative number. Then
the number of maximal independent sets $I \subset V(G)$ for which $|I| \le k$ is at most
$3^{4k-n}4^{n-3k}$.

Proof. We use induction on n; in the base case n = 0, there is one (empty)
maximal independent set, and for any $k \ge 0$, $1 \le 3^{4k}4^{-3k} = (81/64)^k$. Otherwise,
we divide into cases according to the degrees of the vertices in G, as follows:

- If G contains a vertex v of degree three or more, then each maximal indepen-
dent set I either contains v (in which case $I \setminus \{v\}$ is a maximal independent
set of $G \setminus N(v)$) or it does not contain v (in which case I itself is a maximal
independent set of $G \setminus \{v\}$). Thus, by induction, the number of maximal
independent sets of cardinality at most k is at most

$$3^{4k-(n-1)}4^{(n-1)-3k} + 3^{4(k-1)-(n-4)}4^{(n-4)-3(k-1)} = \left(\frac{3}{4} + \frac{1}{4}\right)3^{4k-n}4^{n-3k} = 3^{4k-n}4^{n-3k}$$

as was to be proved.

- If G contains a degree-one vertex v, let its neighbor be u. Then each maximal
independent set contains exactly one of u or v, and removing this vertex from
the set produces a maximal independent set of either $G \setminus N(u)$ or $G \setminus N(v)$.
If the degree of u is d, this gives us by induction a bound of

$$3^{4(k-1)-(n-2)}4^{(n-2)-3(k-1)} + 3^{4(k-1)-(n-d-1)}4^{(n-d-1)-3(k-1)} \le \frac{8}{9}\,3^{4k-n}4^{n-3k}$$

on the number of maximal independent sets of cardinality at most k.

- If G contains an isolated vertex v, then each maximal independent set con-
tains v, and the number of maximal independent sets of cardinality at most
k is at most

$$3^{4(k-1)-(n-1)}4^{(n-1)-3(k-1)} = \frac{16}{27}\,3^{4k-n}4^{n-3k}.$$

- If G contains a chain u-v-w-x of degree two vertices, then each maximal
independent set contains u, contains v, or does not contain u and contains
w. Thus in this case the number of maximal independent sets of cardinality
at most k is at most

$$2 \cdot 3^{4(k-1)-(n-3)}4^{(n-3)-3(k-1)} + 3^{4(k-1)-(n-4)}4^{(n-4)-3(k-1)} = \frac{11}{12}\,3^{4k-n}4^{n-3k}.$$

- In the remaining case, G consists of a disjoint union of triangles, all maxi-
mal independent sets have exactly n/3 vertices, and there are exactly $3^{n/3}$
maximal independent sets. If $k \ge n/3$, then $3^{n/3} \le 3^{4k-n}4^{n-3k}$; if $k < n/3$,
there are no maximal independent sets of cardinality at most k.
// List maximal independent subsets of S smaller than a given parameter.
// S is a set of vertices forming an induced subgraph in G,
// I is a set of vertices to be included in the MIS (initially zero), and
// k bounds the number of vertices of S to add to I.
// We call processMIS(I) on each generated set. Some non-maximal sets may be
// generated along with the maximal ones, but all generated sets are independent.

void smallMIS (set S, set I, int k)
{
    if (S = ∅ or k = 0) processMIS(I);
    else if (there exists v ∈ S with deg(v, S) ≥ 3)
    {
        smallMIS (S \ {v}, I, k);
        smallMIS (S \ N(v), I ∪ {v}, k - 1);
    }
    else if (there exists v ∈ S with deg(v, S) = 1)
    {
        let u be the neighbor of v;
        smallMIS (S \ N(u), I ∪ {u}, k - 1);
        smallMIS (S \ N(v), I ∪ {v}, k - 1);
    }
    else if (there exists v ∈ S with deg(v, S) = 0)
        smallMIS (S \ {v}, I ∪ {v}, k - 1);
    else if (some cycle in S is not a triangle or k ≥ |S|/3)
    {
        let u, v, and w be adjacent degree-two vertices, such that (if possible) u and w are nonadjacent;
        smallMIS (S \ N(u), I ∪ {u}, k - 1);
        smallMIS (S \ N(v), I ∪ {v}, k - 1);
        smallMIS (S \ ({u} ∪ N(w)), I ∪ {w}, k - 1);
    }
}

Fig. 1. Algorithm for listing all small maximal independent sets.



Thus in all cases the number of maximal independent sets is within the 
claimed bound. □ 

Croitoru [3] proved a similar bound with the stronger assumption that all
maximal independent sets have $|I| \le k$. When $n/4 \le k \le n/3$, our result is
tight, as can be seen for a graph formed by the disjoint union of $4k - n$ triangles
and $n - 3k$ $K_4$’s.
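
The tightness claim can be checked by brute force on a small instance. The function names and the edge-list encoding are assumptions of this sketch, and the exhaustive search is only meant for graphs with a handful of vertices.

```python
from itertools import combinations

def count_small_mis(n, edges, k):
    """Brute-force count of maximal independent sets with at most k vertices."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    count = 0
    for size in range(k + 1):
        for cand in combinations(range(n), size):
            chosen = set(cand)
            independent = all(adj[v].isdisjoint(chosen) for v in chosen)
            maximal = all(v in chosen or adj[v] & chosen for v in range(n))
            if independent and maximal:
                count += 1
    return count

def triangles_and_k4s(num_triangles, num_k4s):
    """Disjoint union of complete graphs K_3 and K_4, as (n, edge list)."""
    edges, n = [], 0
    for size, copies in ((3, num_triangles), (4, num_k4s)):
        for _ in range(copies):
            edges += list(combinations(range(n, n + size), 2))
            n += size
    return n, edges

def bound(n, k):
    """The bound 3^(4k-n) * 4^(n-3k) of Theorem 1."""
    return 3 ** (4 * k - n) * 4 ** (n - 3 * k)
```

With two triangles and one $K_4$ we have n = 10 and k = 3, so 4k - n = 2 and n - 3k = 1, and every maximal independent set picks exactly one vertex from each component.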

Theorem 2. There is an algorithm for listing all maximal independent sets
smaller than k in an n-vertex graph G, in time $O(3^{4k-n}4^{n-3k})$.

Proof. We use a recursive backtracking search, following the case analysis of
Theorem 1: if there is a high-degree vertex, we try including it or not including
it; if there is a degree-one vertex, we try including it or its neighbor; if there is
a degree-zero vertex, we include it; and if all vertices form chains of degree-two
vertices, we test whether the parameter k allows any small maximal independent
sets, and if so we try including each of a chain of three adjacent vertices. The
same case analysis shows that this algorithm performs $O(3^{4k-n}4^{n-3k})$ recursive
calls.
int chromaticNumber (graph G)
{
    int X[2^n];
    for (S = 0; S < 2^n; S++)
    {
        if (χ(S) ≤ 3) X[S] = χ(S);
        else X[S] = ∞;
    }
    for (S = 0; S < 2^n; S++)
    {
        if (3 ≤ X[S] < ∞)
        {
            for (each maximal independent set I of G \ S with |I| ≤ |S|/X[S])
                X[S ∪ I] = min(X[S ∪ I], X[S] + 1);
        }
    }
    return X[V(G)];
}

Fig. 2. Algorithm for computing the chromatic number of a graph.



Each recursive call can easily be implemented in time polynomial in the
size of the graph passed to the recursive call. Since our $3^{4k-n}4^{n-3k}$ bound is
exponential in n, even when k = 0, this polynomial overhead at the higher levels
of the recursion is swamped by the time spent at lower levels of the recursion,
and does not appear in our overall time bound. □

A more detailed pseudocode description of the algorithm is shown in Figure 1. 
The given pseudocode may generate non-maximal as well as maximal indepen- 
dent sets, because (when we try not including a high degree vertex) we do not 
make sure that a neighbor is later included. This will not cause problems for our 
chromatic number algorithm, but if only maximal independent sets are desired 
one can easily test the generated sets and eliminate the non-maximal ones. The 
pseudocode also omits the data structures necessary to implement each recursive 
call in time polynomial in |S| instead of polynomial in the number of vertices of
the original graph. 
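
A direct transcription of the branching of Figure 1 into Python sets might look like the following. The dictionary-based graph encoding, the `out` list standing in for processMIS, and the wrapper `list_small_mis` that filters out non-maximal sets are assumptions of this sketch; it also recomputes degrees in every call rather than maintaining the data structures mentioned above.

```python
def small_mis(adj, S, I, k, out):
    """Branching of Fig. 1. adj maps each vertex to its set of neighbors
    (the vertex itself excluded); S and I are frozensets; out collects results."""
    if not S or k == 0:
        out.append(I)
        return
    deg = {v: len(adj[v] & S) for v in S}

    def closed(v):
        return (adj[v] & S) | {v}                 # N(v) within S, including v

    high = next((v for v in S if deg[v] >= 3), None)
    if high is not None:
        small_mis(adj, S - {high}, I, k, out)
        small_mis(adj, S - closed(high), I | {high}, k - 1, out)
        return
    leaf = next((v for v in S if deg[v] == 1), None)
    if leaf is not None:
        (u,) = adj[leaf] & S
        small_mis(adj, S - closed(u), I | {u}, k - 1, out)
        small_mis(adj, S - closed(leaf), I | {leaf}, k - 1, out)
        return
    isolated = next((v for v in S if deg[v] == 0), None)
    if isolated is not None:
        small_mis(adj, S - {isolated}, I | {isolated}, k - 1, out)
        return

    # Every remaining vertex has degree two, so S is a disjoint union of cycles.
    def ends(x):
        a, b = adj[x] & S
        return a, b

    v = next((x for x in S if ends(x)[1] not in adj[ends(x)[0]]), None)
    if v is None:                                 # every cycle is a triangle
        if 3 * k < len(S):
            return                                # no small maximal set exists
        v = next(iter(S))
    u, w = ends(v)
    small_mis(adj, S - closed(u), I | {u}, k - 1, out)
    small_mis(adj, S - closed(v), I | {v}, k - 1, out)
    small_mis(adj, S - ({u} | closed(w)), I | {w}, k - 1, out)

def list_small_mis(adj, k):
    """Run small_mis on the whole graph and keep only the maximal sets."""
    out = []
    small_mis(adj, frozenset(adj), frozenset(), k, out)
    return {I for I in out if all(v in I or adj[v] & I for v in adj)}
```

On a star with three leaves and k = 1, the raw output contains a single-leaf set that is not maximal; the final filter removes it and leaves only the center.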

4 Chromatic Number 

We are now ready to describe our algorithm for computing the chromatic number
of graph G. We use an array X, indexed by the $2^n$ subsets of G, which will
(eventually) hold the chromatic numbers of certain of the subsets including V(G)
itself. We initialize this array by testing, for each subset S, whether $\chi(S) \le 3$; if
so, we set X[S] to $\chi(S)$, but otherwise we set X[S] to ∞.

Next, we loop through the subsets S of V(G), in numerical order (or any
other order such that all proper subsets of each set S are visited before we visit
S itself). When we visit S, we first test whether X[S] ≥ 3. If not, we skip over S
without doing anything. But if X[S] ≥ 3, we loop through the small independent
sets of G \ S, limiting the size of each such set to |S|/X[S], using the algorithm of
the previous section. For each independent set I, we set X[S ∪ I] to the minimum
of its previous value and X[S] + 1.

Finally, after looping through all subsets, we return the value in X[V(G)] as
the chromatic number of G. Pseudocode for this algorithm is shown in Figure 2.
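
The following Python sketch illustrates the dynamic program just described on small graphs. It is not the paper's implementation: the initialization uses a plain backtracking test in place of a fast 3-coloring algorithm, and the inner loop enumerates all independent sets of size at most |S|/X[S] by brute force instead of calling the listing algorithm of the previous section. The function name `chromatic_number` and the bitmask encoding of subsets are assumptions made for illustration.

```python
from itertools import combinations
from math import inf


def chromatic_number(n, edges):
    """Chromatic number of a graph on vertices 0..n-1 (intended for small n)."""
    adj = [0] * n                      # adj[v] = bitmask of neighbours of v
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    full = (1 << n) - 1

    def is_independent(mask):
        return all(not (adj[v] & mask) for v in range(n) if mask >> v & 1)

    def colorable(mask, c):
        """Backtracking stand-in: can the subgraph induced by mask use <= c colors?"""
        verts = [v for v in range(n) if mask >> v & 1]
        classes = [0] * c

        def place(i):
            if i == len(verts):
                return True
            v = verts[i]
            for j in range(c):
                if not (adj[v] & classes[j]):
                    was_empty = classes[j] == 0
                    classes[j] |= 1 << v
                    if place(i + 1):
                        return True
                    classes[j] &= ~(1 << v)
                    if was_empty:
                        break          # remaining empty classes are symmetric
            return False

        return place(0)

    # Initialization: X[S] = chi(S) if chi(S) <= 3, otherwise infinity.
    X = [next((c for c in range(4) if colorable(S, c)), inf)
         for S in range(full + 1)]

    # Numerical order visits every proper subset of S before S itself.
    for S in range(full + 1):
        # Skip unless X[S] >= 3; an infinite entry would give size limit 0 anyway.
        if X[S] < 3 or X[S] == inf:
            continue
        limit = bin(S).count("1") // X[S]          # |I| <= |S| / X[S]
        outside = [v for v in range(n) if not S >> v & 1]
        for size in range(1, limit + 1):
            for combo in combinations(outside, size):
                I = sum(1 << v for v in combo)
                if is_independent(I):              # brute-force enumeration
                    X[S | I] = min(X[S | I], X[S] + 1)
    return X[full]
```

Enumerating non-maximal independent sets as well does not affect the result, since every update still corresponds to a valid coloring; it only costs time that the listing algorithm of the previous section is designed to avoid.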

Lemma 1. Throughout the course of the algorithm, for any set S, X[S] ≥ χ(S).

Proof. Clearly this is true of the initial values of X. Then for any S and any
independent set I, we can color S ∪ I by using a coloring of S and one additional
color shared by all vertices in I, so χ(S ∪ I) ≤ χ(S) + 1 ≤ X[S] + 1, and each step of our
algorithm preserves the invariant. □

Lemma 2. Let M be a maximal (k + 1)-chromatic subset of G, and let (S, I)
be a partition of M into a k-chromatic subset S and an independent subset I,
maximizing the cardinality of S among all such partitions. Then I is a maximal
independent subset of G \ S with |I| ≤ |S|/k, and S is a maximal k-chromatic
subset of G.

Proof. If we have any (k + 1)-coloring of G, then the partition formed by separating
the largest k color classes from the smallest color class satisfies the inequality
|I| ≤ |S|/k, so clearly this also is true when (S, I) is the partition maximizing
|S|. If I were not maximal, due to the existence of another independent set
I ⊂ I' ⊆ G \ S, then S ∪ I' would be a larger (k + 1)-chromatic graph, violating
the assumption of maximality of M.

Similarly, suppose there were another k-chromatic set S ⊂ S' ⊆ G. Then if
S' ∩ I were empty, S' ∪ I would be a (k + 1)-chromatic superset of M, violating
the assumption of M's maximality. But if S' ∩ I were nonempty, (S', I \ S') would
be a better partition than (S, I), so in either case we get a contradiction. □

Lemma 3. Let M be a maximal (k + 1)-chromatic subset of G. Then, when the
outer loop of our algorithm reaches M, it will be the case that X[M] = χ(M).

Proof. Clearly, the initialization phase of the algorithm causes this to be true
when χ(M) ≤ 3. Otherwise, let (S, I) be as in Lemma 2. By induction on |M|,
X[S] = χ(S) at the time we visit S. Then X[S] ≥ 3, and |I| ≤ |S|/X[S], so the
inner loop for S will visit I and set X[M] to X[S] + 1 = χ(M). □

Theorem 3. We can compute the chromatic number of a graph G in time
O((4/3 + 3^{4/3}/4)^n) and space O(2^n).

Proof. V(G) is itself a maximal χ(G)-chromatic subset of G, so Lemma 3 shows
that the algorithm correctly computes χ(G) = X[V(G)]. Clearly, the space is
bounded by O(2^n). It remains to analyze the algorithm's time complexity.
void color (graph G)
{
    compute array X as in Figure 2;
    S = V(G);
    for (T = 2^n - 1; T >= 0; T--)
    {
        if (T ⊂ S and X[S \ T] = 1 and X[T] = X[S] - 1)
        {
            color all vertices in S \ T with the same new color;
            S = T;
        }
    }
}



Fig. 3. Algorithm for optimally coloring a graph. 



First, we consider the time spent initializing X. Since we perform a 3-coloring 
algorithm on each subset of G, this time is 

\[
\sum_{S \subseteq V(G)} O\bigl(1.3289^{|S|}\bigr) = O\Bigl(\sum_{i=0}^{n} \binom{n}{i}\, 1.3289^{i}\Bigr) = O(2.3289^{n}).
\]

Finally, we bound the time in the main loop of the algorithm. We may pos- 
sibly apply the algorithm of Theorem 2 to generate small independent subsets 
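of each set G \ S. In the worst case, X[S] = 3 and we can only limit the size of
the generated independent sets to |S|/3. We spend constant time adjusting the
value of X[S ∪ I] for each generated set. Thus, the total time can be bounded as

\[
\sum_{S \subseteq V(G)} O\bigl(3^{4|S|/3 - |G \setminus S|}\, 4^{|G \setminus S| - 3|S|/3}\bigr)
= O\Bigl(\sum_{i=0}^{n} \binom{n}{i}\, 3^{4i/3 - (n-i)}\, 4^{(n-i) - i}\Bigr)
= O\bigl((4/3 + 3^{4/3}/4)^{n}\bigr).
\]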

This final term dominates the overall time bound. □ 
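
As a numerical sanity check of the two sums in this proof, the short Python sketch below evaluates them with the binomial theorem in mind. It assumes the per-set cost 3^{4k-n'} 4^{n'-3k} with k = |S|/3 and n' = |G \ S| for the main loop, as in the analysis above; the function names are hypothetical.

```python
from math import comb, isclose


def init_bound(n):
    """Sum over subset sizes i of C(n, i) * 1.3289**i (initialization phase)."""
    return sum(comb(n, i) * 1.3289 ** i for i in range(n + 1))


def main_loop_bound(n):
    """Sum over i = |S| of C(n, i) * 3**(4i/3 - (n - i)) * 4**((n - i) - i)."""
    return sum(comb(n, i) * 3 ** (4 * i / 3 - (n - i)) * 4 ** ((n - i) - i)
               for i in range(n + 1))


# Base of the exponential in Theorem 3.
BASE = 4 / 3 + 3 ** (4 / 3) / 4
```

Both sums collapse by the binomial theorem: the first to (1 + 1.3289)^n and the second to BASE^n, where BASE is roughly 2.415, which is why the main loop dominates the initialization.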

5 Finding a Coloring 

Although the algorithm of the previous section finds the chromatic number of 
G, it is likely that an explicit coloring is desired, rather than just this number. 
The usual method of performing this sort of construction task in a dynamic 
programming algorithm is to augment the dynamic programming array with 
back pointers indicating the origin of each value computed in the array, but since 
storing 2^n chromatic numbers is likely to be the limiting factor in determining
how large a graph this algorithm can be applied to, it is likely that also storing
2^n set indices will severely reduce its applicability.

Instead, we can simply search backwards from V(G) until we find a subset
S that can be augmented by an independent set to form V(G), and that has
chromatic number χ(S) = χ(G) − 1 as indicated by the value of X[S]. We assign
the first color to G \ S. Then, we continue searching for a similar subset T ⊂ S,
etc., until we reach the empty set. Although not every set S may necessarily
have X[S] = χ(S), it is guaranteed that for any S we can find T ⊂ S with
S \ T independent and X[T] = X[S] − 1, so this search procedure always finds
a correct coloring.

Theorem 4. After computing the array X as in Theorem 3, we can compute 
an optimal coloring of G in additional time O(2^n) and no additional space.

We omit the details of the correctness proof and analysis. A pseudocode 
description of the coloring algorithm is shown in Figure 3. 
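
The backward search can be sketched in Python as follows. To keep the example self-contained, the array X is filled by a simple exact subset recurrence (a stand-in, not the algorithm of Theorem 3); only the final loop mirrors the search described above. The names `chi_table` and `optimal_coloring` are hypothetical.

```python
def chi_table(n, adj):
    """Stand-in for the array X: exact chromatic number of every vertex subset."""
    size = 1 << n
    indep = [True] * size
    for mask in range(1, size):
        v = (mask & -mask).bit_length() - 1      # lowest vertex in mask
        rest = mask & (mask - 1)
        indep[mask] = indep[rest] and not (adj[v] & rest)
    X = [0] * size
    for mask in range(1, size):
        low = mask & -mask
        best = n + 1
        sub = mask
        while sub:                               # all nonempty submasks of mask
            if sub & low and indep[sub]:
                best = min(best, X[mask ^ sub] + 1)
            sub = (sub - 1) & mask
        X[mask] = best
    return X


def optimal_coloring(n, edges):
    """Return a list color[v] using chi(G) colors, found by searching backwards."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    X = chi_table(n, adj)
    color = [None] * n
    S = (1 << n) - 1
    next_color = 0
    for T in range(S, -1, -1):                   # decreasing numerical order
        is_proper_subset = (T & ~S) == 0 and T != S
        if is_proper_subset and X[S ^ T] == 1 and X[T] == X[S] - 1:
            for v in range(n):
                if (S ^ T) >> v & 1:             # vertices of S \ T
                    color[v] = next_color
            next_color += 1
            S = T
    return color
```

Because every subset of T is numerically smaller than T, a single decreasing pass over the subsets suffices, and no back pointers are stored.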



6 Conclusions 

We have shown a bound on the number of small independent sets in a graph, 
shown how to list all small independent sets in time proportional to our bound, 
and used this algorithm in a new dynamic programming algorithm for computing 
the chromatic number of a graph as well as an optimal coloring of the graph. 

Our bound on the number of small independent sets is tight for n/4 ≤ k ≤
n/3, and an examination of the analysis of Theorem 3 shows that independent
set sizes in this range are also the ones leading to the worst case time bound 
for our coloring algorithm. Nevertheless, it would be of interest to find tight 
bounds on the number of small independent sets for all ranges of k. It would 
also be of interest to find an algorithm for listing all small maximal independent 
sets in time proportional to the number of generated sets rather than simply 
proportional to the worst case bound on this number. 

Our worst case analysis of the chromatic number algorithm assumes that, 
every time we call the procedure for listing small maximal independent sets, this 
procedure achieves its worst case time bound. But is it really possible for all sets 
G \ S to be worst case instances for this procedure? If not, perhaps the analysis
of our coloring algorithm can be improved. 

Can we prove a bound smaller than $\binom{n}{i}$ on the number of i-vertex maximal
k-chromatic induced subgraphs of a graph G? If such a bound could be proven,
even for k = 3, we could likely improve the algorithm presented here by only
looping through the independent subgraphs of G \ S when S is maximal.

An alternative possibility for improving the present algorithm would be to 
find an algorithm for testing whether χ(G) ≤ 4 in time o(1.415^n). Then we
could test the four-colorability of all subsets of G before applying the rest of
our algorithm, and avoid looping over maximal independent subsets of G \ S
unless X[S] ≥ 4. This would produce tighter limits on the independent set sizes
and therefore reduce the number of independent sets examined. However such 
a time bound would be significantly better than what is currently known for 
four-coloring algorithms. 
Abstract. Even though a large number of I/O-efficient graph algorithms
have been developed, a number of fundamental problems still
remain open. For example, no space- and I/O-efficient algorithms are
known for depth-first search or breadth-first search in sparse graphs. In
this paper we present two new results on I/O-efficient depth-first search
in an important class of sparse graphs, namely undirected embedded planar
graphs. We develop a new efficient depth-first search algorithm and
show how planar depth-first search in general can be reduced to planar
breadth-first search. As part of the first result we develop the first
I/O-efficient algorithm for finding a simple cycle separator of a biconnected
planar graph. Together with other recent reducibility results, the second
result provides further evidence that external memory breadth-first
search is among the hardest problems on planar graphs.



1 Introduction 

External memory graph algorithms have received considerable attention lately 
because massive graphs arise naturally in many applications. Recent web crawls, 
for example, produce graphs with on the order of 200 million nodes and 2 billion 
edges, and recent work in web modeling uses depth-first search, breadth-first 
search, shortest path computation and connected component computation as 
primitive routines for investigating the structure of the web. Massive graphs
are also often manipulated in Geographic Information Systems (GIS), where
many common problems can be formulated as basic graph problems. Yet another
example of a massive graph is AT&T's 20TB phone-call data graph [7].
When working with such massive graphs the I/O-communication, and not the
internal memory computation time, is often the bottleneck. Designing I/O-efficient
algorithms can thus lead to considerable runtime improvements.

Breadth-first search (BFS) and depth-first search (DFS) are the two most 
fundamental graph searching strategies. They are extensively used in many 
graph algorithms. The reason is that both strategies can be implemented in 
linear time in internal memory; still they reveal important information about 
the structure of the input graph. Unfortunately, no I/O-efficient BFS or DFS
algorithms are known for arbitrary sparse graphs, while known algorithms perform
reasonably well on dense graphs. In this paper we consider an important
class of sparse graphs, namely undirected embedded planar graphs. This class is
restricted enough to hope for more efficient algorithms than for arbitrary sparse
graphs. Several such algorithms have indeed been obtained recently. We
develop an improved DFS algorithm for planar graphs and show how planar DFS 
can be reduced to planar BFS. Since several other problems on planar graphs 
have also been shown to be reducible to BFS, this provides further evidence that 
in external memory BFS is among the hardest problems on planar graphs. 



1.1 I/O-Model and Previous Results 

We work in the standard two-level I/O model with one (logical) disk (our
results work in a D-disk model; but for brevity we only consider one disk in this
extended abstract). The model defines the following parameters:

N = number of vertices and edges (N = |V| + |E|),

M = number of vertices/edges that can fit into internal memory,

B = number of vertices/edges per disk block,

where 2B < M < N. An Input/Output operation (or simply I/O) involves
reading (or writing) a block from (to) disk into (from) internal memory. Our
measure of performance of an algorithm is the number of I/Os it performs. The
number of I/Os needed to read N contiguous items from disk is scan(N) = Θ(N/B)
(the linear or scanning bound). The number of I/Os required to sort N items
is sort(N) = Θ((N/B) log_{M/B} (N/B)) (the sorting bound). For all realistic values of
N, B, and M, scan(N) < sort(N) ≪ N. Therefore the difference between the
running times of an algorithm performing N I/Os and one performing scan(N)
or sort(N) I/Os can be very significant.
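
To give a feeling for the gap between these bounds, the Python sketch below evaluates scan(N) and sort(N) with all hidden constants set to 1. The values chosen for M and B are hypothetical and only serve as an illustration; N is taken from the web-crawl example in the introduction (about 200 million nodes plus 2 billion edges).

```python
import math


def scan_ios(N, B):
    """Scanning bound N/B, ignoring constant factors."""
    return N / B


def sort_ios(N, M, B):
    """Sorting bound (N/B) * log_{M/B}(N/B), ignoring constant factors."""
    return (N / B) * math.log(N / B, M / B)


N = 200_000_000 + 2_000_000_000   # vertices plus edges of the web-crawl example
M = 2 ** 27                       # assumed: items that fit in internal memory
B = 2 ** 13                       # assumed: items per disk block

print(f"scan(N) ~ {scan_ios(N, B):,.0f} I/Os")
print(f"sort(N) ~ {sort_ios(N, M, B):,.0f} I/Os")
print(f"N       = {N:,} I/Os")
```

With parameters in this range the logarithmic factor stays close to 1, so sort(N) is only slightly larger than scan(N), while both are several orders of magnitude below N.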

I/O-efficient graph algorithms have been considered by a number of au-
thors. For a review see and the references therein. We review the previous
results most relevant to our work. The best previously known general
DFS algorithms on undirected graphs use O((|V| + |E|/B) log_2 |V|)
I/Os or O(|V| + (|V|/M) · (|E|/B)) I/Os. Since the best known general
BFS algorithm uses only O(|V| + (|E|/|V|) sort(|V|)) = O(|V| + sort(|E|))
I/Os, this suggests that on undirected graphs DFS might be harder than
BFS. For directed graphs the best known algorithms for BFS and DFS both



On External-Memory Planar Depth First Search 473 



use O((|V| + |E|/B) · log(|V|/B) + sort(|E|)) I/Os. In general we cannot
hope to design algorithms that perform less than Ω(min(|V|, sort(|V|)))
I/Os for either of the two problems. As mentioned above, in practice
O(min(|V|, sort(|V|))) = O(sort(|V|)). Still, all of the above algorithms use
Ω(|V|) I/Os. For planar graphs this bound is matched by the standard internal
memory algorithms.

Recently, the first o{N) DFS and BFS algorithms for undirected planar 
graphs were developed |T^. These algorithms use -h sort(iVi3'>')) I/Os 

and 0{NB~^) space, for any 0 < 7 < 1/2. Further improved algorithms have 
been developed for special classes of planar graphs. For trees, O(sort(N)) I/O
algorithms are known for both BFS and DFS, as well as for Euler tour computation,
expression tree evaluation, topological sorting, and several other problems.
BFS and DFS can also be solved in O(sort(N)) I/Os on outerplanar
graphs and on k-outerplanar graphs. Developing O(sort(N)) I/O DFS
and BFS algorithms for arbitrary planar graphs is a challenging open problem.



1.2 Our Results 

The contribution of this paper is two-fold. In Sec. 2 we present a new DFS
algorithm for undirected embedded planar graphs that uses O(sort(N) log N)
I/Os and linear space. For most practical values of B, M and N this algorithm
uses o(N) I/Os and is the first such algorithm using linear space. The algorithm
is based on a divide-and-conquer approach first proposed in earlier work. It utilizes a
new O(sort(N)) I/O algorithm for finding a simple cycle in a biconnected planar
graph such that neither the subgraph inside nor the one outside the cycle contains
more than a constant fraction of the edges of the graph. Previously, no such
algorithm was known.

In Sec. 3 we use ideas similar to the ones utilized in previous work to obtain an
O(sort(N)) I/O reduction from DFS to BFS on undirected embedded planar
graphs. Contrary to what has been conjectured for general graphs, this shows
that for planar graphs BFS is as hard as DFS. A recent paper shows that given
a BFS-tree of a planar graph, the single source shortest path problem as well
as the multi-way separation problems can be solved in O(sort(N)) I/Os. Together,
these results suggest that BFS may indeed be a universally hard problem
for planar graphs. That is, if planar BFS can be performed I/O-efficiently, most
other problems on planar graphs can also be solved I/O-efficiently.

2 DFS Using Simple Cycle Separators 

2.1 Outline of the Algorithm 

Our O(sort(N) log N) I/O and linear space algorithm for computing a DFS tree
of a planar graph is based on a divide-and-conquer approach proposed in earlier work.

A cutpoint of a graph G is a vertex whose removal disconnects G. We first
consider the case where G is biconnected, i.e., does not contain any cutpoints. In
Sec. 2.2 we show that for a biconnected planar graph G we can compute a simple
cycle α-separator in O(sort(N)) I/Os (Thm. 2). A simple cycle α-separator C of
G is a simple cycle such that neither the subgraph inside nor outside the cycle
contains more than α|E| edges. The main idea of our algorithm is to partition
G using a simple cycle α-separator, for some constant 0 < α < 1, recursively
compute DFS-trees for the connected components of G \ C, and combine them
to obtain a DFS-tree for G. Given that each recursive step can be realized in
O(sort(N)) I/Os, the whole algorithm takes O(sort(N) log N) I/Os.

Fig. 1. The path P' is shown in bold. Components are shaded dark grey. Medium
edges are edges {u_i, v_i}. Light edges are non-tree edges.

In more detail, we construct a DFS tree T of a biconnected embedded planar
graph G, rooted at some vertex s, as follows (see Fig. 1). First we compute a
simple cycle α-separator C of G in O(sort(N)) I/Os. Then we find a path P
from s to some vertex v in C by computing a spanning tree T' of G and finding
the closest vertex to s in C along T'. This also takes O(sort(N)) I/Os. Next
we extend P to a path P' containing all vertices in P and C. To do so we
identify the counterclockwise neighbor w ∈ C of v, relative to the last edge on
P, remove edge {v, w} from C, rank the resulting path to obtain the clockwise
order of the vertices in C, and finally concatenate P with the resulting path. All
these steps can be performed in O(sort(N)) I/Os. We compute the connected
components H_1, ..., H_k of G \ P' in O(sort(N)) I/Os. For each component
H_i, we find the vertex v_i ∈ P' furthest away from s along P' such that there is
an edge {u_i, v_i}, u_i ∈ H_i. This can easily be done in O(sort(N)) I/Os. Next we
recursively compute DFS trees T_1, ..., T_k for components H_1, ..., H_k and obtain
a DFS tree T for G as the union of trees T_1, ..., T_k, path P', and edges {u_i, v_i},
1 ≤ i ≤ k. Note that components H_1, ..., H_k are not necessarily biconnected.
Below we show how to deal with this case.

To see that T is indeed a DFS tree, first note that there are no edges between
components H_1, ..., H_k. For every non-tree edge {v, w} connecting a vertex v
in a component H_i with a vertex w in P', v is a descendant of u_i and, by the
choice of v_i, w is an ancestor of v_i. Thus all non-tree edges in G are back-edges,
and T is a DFS tree.
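
The criterion used in this argument, that a rooted spanning tree of an undirected graph is a DFS tree exactly when every non-tree edge joins an ancestor with a descendant, is easy to check in internal memory. The Python sketch below does so using entry and exit times; the function name `is_dfs_tree` and the parent-array representation are assumptions for illustration, and the code is not I/O-efficient.

```python
def is_dfs_tree(n, edges, parent, root):
    """Check that every non-tree edge of the graph is a back-edge.

    parent[v] is the tree parent of v (None for the root); the tree is
    assumed to be a spanning tree made of graph edges.
    """
    children = [[] for _ in range(n)]
    for v in range(n):
        if v != root:
            children[parent[v]].append(v)

    # Entry/exit times from an iterative traversal of the tree.
    tin, tout, clock = [0] * n, [0] * n, 0
    stack = [(root, False)]
    while stack:
        v, finished = stack.pop()
        if finished:
            tout[v] = clock
        else:
            tin[v] = clock
            stack.append((v, True))
            stack.extend((c, False) for c in children[v])
        clock += 1

    def is_ancestor(a, d):
        return tin[a] <= tin[d] and tout[d] <= tout[a]

    for u, v in edges:
        if parent[u] == v or parent[v] == u:
            continue                      # tree edge
        if not (is_ancestor(u, v) or is_ancestor(v, u)):
            return False                  # cross-edge found
    return True
```

For a triangle on vertices 0, 1, 2 the path 0, 1, 2 passes this check, whereas the star centered at 0 fails because the edge {1, 2} becomes a cross-edge.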

We handle the case where G is not biconnected by finding the biconnected
components or bicomps (i.e., the maximal biconnected subgraphs) of G, computing
a DFS tree for each bicomp and joining them at the cutpoints. More
precisely, we compute the bicomp-cutpoint-tree T_G of G containing all cutpoints
of G and one vertex v(C) per bicomp C. There is an edge between a cutpoint v
and a bicomp vertex v(C) if v is contained in C. We choose a bicomp C_r containing
vertex s as the root of T_G. The parent cutpoint of a bicomp C is the parent
p(v(C)) of v(C) in T_G. The parent bicomp of C is the bicomp C' corresponding
to v(C') = p(p(v(C))). T_G can be constructed in O(sort(N)) I/Os. We compute
a DFS tree of C_r rooted at vertex s. In all other bicomps C, we compute
a DFS tree rooted at the parent cutpoint of C. The union of the resulting DFS
trees is a DFS tree for G rooted at s, as there are no edges between different
bicomps. Thus, we obtain our first main result.

Theorem 1. A DFS tree of an embedded planar graph can be computed in
O(sort(N) log N) I/O operations and linear space.

2.2 Finding a Simple Cycle Separator 

Utilizing ideas similar to the ones used in [I 1 1 1 tij we now show how to compute 
a simple cycle |-separator for a planar biconnected graph. 

Given an embedded planar graph G, the faces of G are the connected regions
of ℝ² \ G. We use F to denote the set of faces of G. The boundary of a face f is
the set of edges contained in the closure of f. For a set F' of faces of G, let $G_{F'}$
be the subgraph of G defined as the union of the boundaries of the faces in F'.
The complement $\overline{G_{F'}}$ of $G_{F'}$ is the graph obtained as the union of boundaries of
all faces in F \ F'. The boundary of $G_{F'}$ is the intersection between $G_{F'}$ and its
complement $\overline{G_{F'}}$. The dual G* of G is the graph containing one vertex f* per
face f ∈ F, and an edge between two vertices f_1* and f_2* if faces f_1 and f_2 share
an edge. We use v*, e*, and f* to refer to the face, edge, and vertex which is
dual to vertex v, edge e, and face f, respectively. The dual G* of a planar graph
G is planar and can be computed in O(sort(N)) I/Os.

The main idea in our algorithm is to find a set of faces F' C F such that the 
boundary of Gpr is a simple cycle |-separator. The main difficulty is to ensure 
that the boundary of Gpr is a simple cycle. We compute F' as follows: First we 
check whether there is a single face whose boundary has size at least ^ (Fig.|2^). 
If we find such a face, we report its boundary as the separator C. Otherwise,
we compute a spanning tree T* of the dual G* of G rooted at an arbitrary node
r. Every node v ∈ T* defines a maximal subtree T*(v) of T* rooted at v. The
nodes in this subtree correspond to a set of faces in G whose boundaries define
a graph G(v). Below we show that the boundary of G(v) is a simple cycle in
G. We try to find a node v such that ^\E\ < |G(u)| < ||E|, where |G(u)| is 
the number of edges in G{v) (Fig. Eb). If we succeed, we report the boundary 
of G{v). Otherwise, we are left in a situation where for every leaf I G T* (face 
in G*) we have |G(I)| < g|G|, for the root r of T* we have |G(r)| = \E\, and 
for every other vertex v G T* either |G(u)| < ^\E\ or |G(u)| > ||E|. Thus, 
there has to be a node v with |G(u)| > ||if| and |G(wi)| < ||E|, for all children 
w\, . . . ,Wk of u. We show how to compute a subgraph G' of G{v) consisting of 
the boundary of the face v* and a subset of the graphs G(wi), . . . , G(wu) such 
that \\E\ < |G'| < ||G|, and the boundary of G' is a simple cycle (Fig. Et)- 
Below we describe our algorithm in detail and show that all of the above steps 
can be performed in O(sort(N)) I/Os. This proves the following theorem.



476 L. Arge et al. 






Fig. 2. (a) A heavy face, (b) A heavy subtree, (c) Splitting a heavy subtree. 

Theorem 2. A simple cycle ^-separator of an embedded biconnected planar 
graph can be computed in O(sort(N)) I/O operations and linear space.



Checking for heavy faces. In order to check if there exists a face / in G 
with a boundary of size at least ^\E\, we represent each face of G as a list of 
vertices along its boundary. Computing such a representation takes O(sort(N))
I/Os. Then we scan these lists to see whether any of them has length at
least ^\E\. 



Checking for heavy subtrees. First we prove that the boundary of G(v)
defined by the nodes in T*(v) is a simple cycle. A planar graph is uniform if
its dual is connected. Since for every v ∈ T*, T*(v) and T* \ T*(v) are both
connected, G(v) and its complement $\overline{G(v)}$ are both uniform. Using the following
lemma, this implies that the boundary of G(v) is a simple cycle.

Lemma 1 (Smith [18]). Let G' be a subgraph of a biconnected planar graph
G. The boundary of G' is a simple cycle if and only if G' and its complement
are both uniform.

Next we show how to find a node v G T* such that ^\E\ < |G(u)| < ||A|. 
G* and T* can both be computed in O(sort(N)) I/Os. For every node
v ∈ T*, let |v*| be the number of edges on the boundary of face v*. Let the
weight ω(G(v)) of subgraph G(v) be defined as $\omega(G(v)) = \sum_{w \in T^*(v)} |w^*|$.
As $\omega(G(v)) = |v^*| + \sum_{i=1}^{k} \omega(G(w_i))$, where w_1, ..., w_k are the children of v in T*, we
can process T* bottom-up to compute the weights of all subgraphs G(v). Using
time-forward processing, this takes O(sort(N)) I/Os. For a node v in T* every
boundary edge of G(v) is counted once in ω(G(v)); every interior edge is counted
twice. This implies that if ||G| < uj(G(v)) < \\E\, then \\E\ < |G(u)| < ||G|. 
Thus, we can find a node v G T* with ^\E\ < |G(u)| < ||G| in 0(scan(A^)) 
I/Os by scanning through the list of nodes T* and finding a node v such that 
^\E\ < u>{G{v)) < §|G|, if such a node existsQ 
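
The bottom-up weight computation can be sketched in internal memory as follows. This Python fragment is only an illustration of the recurrence ω(G(v)) = |v*| + Σ ω(G(w_i)); it does not use time-forward processing, and the dictionary-based tree representation and the name `subtree_weights` are assumptions.

```python
def subtree_weights(children, face_size, root):
    """Weight of every subtree of the dual spanning tree T*.

    children[v] lists the children of node v in T* (missing key = leaf);
    face_size[v] is |v*|, the number of boundary edges of the face dual to v.
    """
    order, stack = [], [root]
    while stack:                       # preorder: parents before children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    weight = {}
    for v in reversed(order):          # children are processed before parents
        weight[v] = face_size[v] + sum(weight[w] for w in children.get(v, []))
    return weight


# Example: the tetrahedron K4 has four triangular faces and six edges.
# A star-shaped spanning tree of its dual, rooted at face "r":
children = {"r": ["a", "b", "c"]}
face_size = {"r": 3, "a": 3, "b": 3, "c": 3}
weights = subtree_weights(children, face_size, "r")
```

In the example the root weight is 12, twice the number of edges, because G(r) is the whole graph and so every edge is interior and counted twice.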

^ Note that even if a node v with ||if| < |G(u)| < ||i?| exists in T* , the algorithm 
might not find it since it does not follow that ||A| < u){G{v)) < ||A|. This is not 
Fig. 3. (a) The boundary of v* ∪ G(w_3) is not a simple cycle. (b) Grey regions are in



Splitting a heavy subtree. We are now in a situation where no vertex v € T* 
satisfies ||i?| < uj{G{v)) < ^\E\. Thus, there must be a vertex v € T* with 
children wi, . . . ,Wk such that uj(G(v)) > \\E\ and uj{G{wi)) < for 1 < i < 

k. Our goal is to compute a subgraph of G{v) consisting of the boundary v* and 
a subset of the graphs G{wi) whose weight is between ^\E\ and ^\E\ and whose 
boundary is a simple cycle G. 

In previous work it is claimed that the boundary of the graph defined by v* and any
subset of graphs G(w_i) is a simple cycle. Unfortunately this is not true in general,
as illustrated in Fig. 3(a). However, as we show below, we can compute a
permutation σ : {1, ..., k} → {1, ..., k} such that if we start with v* and incrementally
“glue” G(w_σ(1)), G(w_σ(2)), ..., G(w_σ(k)) onto face v*, the boundary
of each of the obtained graphs is a simple cycle. More formally, we show that if
we define $H_\sigma(i) = v^* \cup \bigcup_{j=1}^{i} G(w_{\sigma(j)})$, then $H_\sigma(i)$ and $\overline{H_\sigma(i)}$ are both uniform
for all 1 ≤ i ≤ k. This implies that the boundary of H_σ(i) is a simple cycle
by Lemma 1. Since we have already computed the sizes |v*| of faces v* and
the weights ω(G(v)) of all graphs G(v), it takes O(scan(N)) I/Os to compute
weights uj{Ha{i)), 1 < * < fc, and find index i such that ^|G| < < ^\E\. 

It remains to show how to compute the permutation σ I/O-efficiently.

To construct σ, we extract G(v) from G, label v* with 0, and label every
face in G(w_i) with i. Next we label every edge in G(v) with the labels of the
two faces on each side of it. We perform the labeling in O(sort(N)) I/Os using
the previously computed representations of G and G* and a post-order traversal
of T*. Details will appear in the full paper. Now consider the vertices v_1, ..., v_t
on the boundary of v* in the order they appear clockwise around v*, starting
at the common endpoint of an edge shared by v* and the face corresponding to
v's parent p(v) in T*. As in Sec. 2.1 we can compute this order in O(sort(N))
I/Os using list ranking. For each v_i we construct a list L_i of edges around v_i in

a problem, however, since in this case a simple cycle |-separator will still be found 
in the final phase or our algorithm. In the full paper we discuss how to modify the 
algorithm in order to compute |G(u)| exactly for each node v £ T* . This allows us 
to find even a simple cycle |-separator. 
clockwise order, starting with edge {v_{i-1}, v_i} and ending with edge {v_i, v_{i+1}}.
These lists are easily computed in O(sort(N)) I/Os from the embedding of G.
Let L be the concatenation of lists L_1, L_2, ..., L_t. For an edge e in L incident
to a vertex v_i, let f_1 and f_2 be the two faces incident to e, where f_1 precedes
f_2 in clockwise order around v_i. We construct a list F of face labels from L in
O(scan(N)) I/Os by considering the edges in L in order and appending the labels
of f_1 and f_2 in this order to F. List F consists of integers between 1 and k. Some
integers may appear more than once, and the occurrences of some integer i are
not necessarily consecutive. (This happens if the union of v* with a subgraph
G(w_i) encloses another subgraph G(w_j); see Fig. 3(a).) We construct a final list
S by removing all but the last occurrence of each integer from F. (Intuitively, this
ensures that if the union of v* and G(w_i) encloses another subgraph G(w_j), then
j appears before i in S.) This takes O(sort(N)) I/Os by sorting and scanning F
twice. Again details will appear in the full paper. S contains each of the integers
1 through k exactly once and thus defines a permutation σ. All that remains is
to prove the following lemma.

Lemma 2. For all 1 ≤ i ≤ k, $H_\sigma(i)$ and $\overline{H_\sigma(i)}$ are both uniform.

Proof. Every graph Hg-ii) is uniform because every subgraph G(wj) is uniform 

and Wj is connected to v by an edge in G*. Next we show that every Hu(i) is 

* 

uniform. To do this we must show that every Ha(i) is connected. Note that 

G{v) C G{v) is uniform, and each graph G(wj) is uniform. Hence, in 

* 

order to prove that Flcr(,i) is connected, it suffices to show that for all i < j < k, 

^ 

there is a path in Hu{i) connecting a vertex in G{wj)* to a vertex in G{v) . So 

assume for the sake of contradiction that there is a graph G{wj), i < j < k, such 
that there is no such path from a vertex in G{wj)* to a vertex in G{v) in H„(i) 
(Fig. 0 (b)). Let G be the uniform component of H„{i) containing G{wj), and C 
be the boundary cycle of G. Let P be the path obtained by removing the edges 
shared by v* and p{v)* from the boundary cycle of v* . Let v\ be the first vertex 
of G encountered during a clockwise walk along P; let V2 be the last such vertex. 
We define P' to be the path obtained by walking clockwise around G starting at 
v\ and ending at V2. Let e\ be the first and 62 be the last edge on P' . Edge e\ 
separates two faces fi £ H^{i) and /2 G Hdi). Similarly, edge 62 separates two 
faces /a G H^{i) and /4 G Let ji,j2,j3 and j'4 be the labels of these faces. 

We show that label j2 appears before label j'4 in S: Assume that label j2 appears 
after label jh in S. Then there has to be a face /' with label j2 occurring after 
face /4 clockwise around v* . In particular, face /' is outside cycle G, while face 
/2 is inside. As G*{wj^) is connected there has to be a path from /'* to /2 in 
G*{wj2). But this is not possible since every path from /| to /'* must contain 
an edge e* , for some edge e G C, and edge e* cannot be in G*{wj^) because 
one of its endpoints is in H*{i). Therefore it follows that label j2 appears before 
label j4 in S. But this means means that /2 is being added to H^{i) before fi, 
contradicting the assumption that /2 G H^{i) and fi G Hr,{i). □ 
Fig. 4. (a) A graph G with its faces colored according to their levels; level 0 white,
level 1 light grey, level 2 dark grey. (b) H_0 (solid), H_1 (dotted), H_2 (dashed).



3 Reducing DFS to BFS 

This section gives an I/O-efficient reduction from DFS in an embedded planar
graph G to BFS in its vertex-on-face graph, using ideas from previous work. The
vertex-on-face graph G† of G is defined as follows: The vertex set of G† is V ∪ F*; there
is an edge (v, f*) in G† if v is on the boundary of face f. The graph G† can be
computed from G in O(sort(N)) I/Os in a way similar to the computation of the
dual G* of G. We use the vertex-on-face graph instead of the graph used in that work,
because the vertex-on-face graph of an embedded planar graph G is planar. This
could be important in case planar BFS turns out to be easier than general BFS.

The basic idea in our algorithm is to partition the faces of G into levels
around a source face with the source s of the DFS tree on its boundary, and then
grow a DFS tree level-by-level. Let the source face be at level 0. We partition
the remaining faces of G into levels so that all faces at level 1 share a vertex
with the level-0 face, all faces at level 2 share a vertex with some level-1 face
but not with the level-0 face, and so on (Fig. 4(a)). Let G_i be the subgraph of
G defined by the union of the boundaries of faces at level at most i, and let
H_i = G_i \ G_{i-1} (Fig. 4(b)). We call the vertices of H_i level-i vertices. To grow
the DFS tree we start by walking clockwise around the level-0 face G_0 until
we reach the counterclockwise neighbor of s on G_0. The resulting path is a DFS
tree T_0 for G_0. Next we build a DFS tree for H_1 and attach it to T_0 in a way
that does not introduce cross-edges, thereby obtaining a DFS tree T_1 for G_1.
We repeat this process until we have processed all layers H_i. The key to the
efficiency of the algorithm lies in the simple structure of the graphs H_i. Below
we give the details of our algorithm and prove the following theorem.

Theorem 3. Let G be an undirected embedded planar graph, G† its vertex-on-face
graph, and f a face of G containing the source vertex s. Given a BFS tree
of G† rooted at f*, a DFS tree of G rooted at s can be computed in O(sort(N))
I/Os.



A clockwise walk on the boundary of a face means walking so that the face is to our 
right. 

Fig. 5. (a) shown in bold, (b) Ti, H 2 and attachment 
are labeled with their DFS-depths. (c) The DFS tree. 




Corollary 1. If there is an algorithm that computes a BFS tree of a planar
graph in I(N) I/Os using S(N) space, then DFS on planar graphs takes O(I(N))
I/Os and O(S(N)) space.

First consider the computation of the graphs G_i and H_i. The levels of all faces 
can be obtained from a BFS tree for the vertex-on-face graph rooted at a face 
containing s (Fig. 0a)). Every vertex of G is at an odd level in the BFS tree; 
every dual vertex corresponding to a face of G is at an even level. The level of a 
face is the level of the corresponding vertex in the BFS tree divided by two. Given 
the levels of all faces, the graphs G_i and H_i can be computed in O(sort(N)) I/Os 
using standard techniques similar to the ones used in computing G* from G. 
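
As an illustration of the level computation described above, the following is a minimal in-memory sketch in Python. It is not the I/O-efficient procedure of this paper: it assumes the faces are given as a dictionary mapping a (hypothetical) face identifier to the list of vertices on its boundary, builds the vertex-on-face graph implicitly, and runs an ordinary BFS from the source face. The names `face_levels`, `faces`, and `source_face` are illustrative assumptions.

```python
from collections import deque

def face_levels(faces, source_face):
    """Return (face_level, vertex_level) from a BFS of the vertex-on-face graph.

    faces: dict mapping a face id to the list of vertices on its boundary
           (an assumed input format for this sketch).
    """
    # Adjacency from vertices back to the faces they lie on.
    faces_at = {}
    for f, boundary in faces.items():
        for v in boundary:
            faces_at.setdefault(v, []).append(f)

    # BFS over the bipartite graph; nodes are tagged ("f", id) or ("v", id).
    depth = {("f", source_face): 0}
    queue = deque([("f", source_face)])
    while queue:
        kind, x = queue.popleft()
        d = depth[(kind, x)]
        if kind == "f":
            neighbors = [("v", v) for v in faces[x]]
        else:
            neighbors = [("f", f) for f in faces_at[x]]
        for n in neighbors:
            if n not in depth:
                depth[n] = d + 1
                queue.append(n)

    # Faces sit at even BFS depth 2i (level i); a vertex at odd depth 2i+1
    # first appears on the boundary of a level-i face, so it is a level-i vertex.
    face_level = {x: d // 2 for (k, x), d in depth.items() if k == "f"}
    vertex_level = {x: (d - 1) // 2 for (k, x), d in depth.items() if k == "v"}
    return face_level, vertex_level
```

In this sketch the level-i vertices are exactly the vertices of H_i, so grouping `vertex_level` by value yields the vertex sets of the layers.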

Now assume that we have computed a DFS tree T_{i-1} for G_{i-1}. Our goal is to 
compute a DFS forest for H_i and link it to T_{i-1} without introducing cross-edges. 
If we can do so in O(sort(|H_i|)) I/Os, we obtain an O(sort(N)) I/O reduction 
from planar DFS to planar BFS. Note that the entire graph H_i lies “outside” the 
boundary of G_{i-1}, i.e., in the complement Ḡ_{i-1}. The boundary of G_{i-1} is in 
H_{i-1} and consists of cycles, called the boundary cycles of G_{i-1}. The graph 
G_{i-1} is uniform, but Ḡ_{i-1} may not be uniform. Graph H_i may consist of 
several connected components. The following lemma shows that H_i has a simple 
structure, which allows us to compute its DFS tree efficiently. 

Lemma 3. The non-trivial bicomps of H_i are the boundary cycles of G_i. 

Proof. Consider a cycle C in H_i. All faces incident to C are at level i or greater. 
Since G_{i-1} is uniform, all its faces are either inside or outside C. Assume w.l.o.g. 
that G_{i-1} is inside C. Then none of the faces outside C shares a vertex with a 
level-(i-1) face. That is, all faces outside C must be at level at least i+1, which 
means that C is a boundary cycle of G_i. Thus any cycle in H_i is a boundary cycle 
of G_i. Every bicomp that is not a cycle consists of at least two cycles sharing at 
least two vertices; but the cycles must be boundary cycles, and two boundary 
cycles of a uniform graph cannot share two vertices. Hence every bicomp is a 
cycle and thus a boundary cycle. □ 

Assume for the sake of simplicity that the boundary of G_{i-1} is a simple cycle, 
so that Ḡ_{i-1} is uniform. During the construction of the DFS tree for G we 
maintain the following invariant used to prove the correctness of the algorithm: 
For every boundary cycle C of G_{i-1}, there is a vertex v on C such that the path 
traversed by walking clockwise along C is a path in T_{i-1}, and v is an ancestor 
of all vertices in C (Figure 03). The depth of a vertex in G_{i-1} is its distance 
from s in T_{i-1}. Let H'_1, ..., H'_k be the connected components of H_i. They can 
be found in O(sort(|H_i|)) I/Os 0. For every component H'_j, we find the deepest 
vertex v_j on the boundary of G_{i-1} such that there is an edge {u_j, v_j} ∈ G with 
u_j ∈ H'_j. We find these vertices using a procedure similar to the one used in 
Sec.O. Below we show how to compute DFS trees T'_j for the components H'_j rooted 
at nodes u_j in O(sort(|H'_j|)) I/Os. Let T_i be the spanning tree of G_i obtained 
by adding these DFS trees and all edges {u_j, v_j} to T_{i-1}. Then T_i is a DFS tree 
for G_i: Let {v, w} be a non-tree edge with v ∈ H'_j. Then either w ∈ H'_j, or w is a 
boundary vertex of G_{i-1}, because H_i ⊆ G \ G_{i-1}. In the former case, {v, w} is a 
back-edge, as T'_j is a DFS tree for H'_j. In the latter case, {v, w} is a back-edge 
because v is a descendant of u_j, and w is an ancestor of v_j, by the choice of v_j 
and by our invariant. 
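
The correctness argument above rests on the fact that a spanning tree of an undirected graph is a DFS tree exactly when every non-tree edge is a back-edge, i.e., joins a vertex to one of its ancestors. The following Python sketch checks this property for small in-memory inputs. It is only an illustration, not part of the external-memory algorithm; the function name `is_dfs_tree` and the parent-map representation (with the root mapped to `None`) are assumptions made for this example.

```python
def is_dfs_tree(parent, edges):
    """Check that every non-tree edge joins a vertex to one of its ancestors.

    parent: dict mapping each vertex to its tree parent (root maps to None).
    edges:  iterable of undirected edges (u, w) of the graph.
    """
    def ancestors(v):
        # Collect v and all vertices on the tree path from v to the root.
        seen = set()
        while v is not None:
            seen.add(v)
            v = parent[v]
        return seen

    for u, w in edges:
        if parent.get(u) == w or parent.get(w) == u:
            continue  # tree edge
        if u not in ancestors(w) and w not in ancestors(u):
            return False  # cross-edge: endpoints are unrelated in the tree
    return True
```

For a triangle on vertices 1, 2, 3, the path 1-2-3 passes the check, whereas the star rooted at 1 fails because {2, 3} would be a cross-edge.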

All that remains to show is how to compute the DFS tree rooted at u_j 
for each connected component H'_j of H_i. If we can compute DFS trees for the 
biconnected components of H'_j, we obtain a DFS tree for H'_j using the 
bicomp-cutpoint tree as in Sec. El. By Lemma 3, the non-trivial biconnected 
components of H_i are cycles. Let C be such a cycle in H'_j, and v be the chosen 
root for the DFS tree of C. The path obtained after removing the edge between v 
and its counterclockwise neighbor w along C is a DFS tree for C. We find w using 
techniques similar to those applied in Sec. [2. In total we compute the DFS tree 
for H'_j in O(sort(|H'_j|)) I/Os. As this adds simple paths along the boundary 
cycles of G_i to T_i, the above invariant is preserved. 
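
The treatment of a single cycle can be made concrete with a small Python sketch. It assumes the cycle is given as a list of vertices in clockwise order (an input format chosen for this example) and returns the DFS tree as a parent map; the name `cycle_dfs_tree` is hypothetical. Walking clockwise from the root and stopping at its counterclockwise neighbor is the same as dropping the edge between the two.

```python
def cycle_dfs_tree(cycle_cw, root):
    """DFS tree of a simple cycle, as a parent map.

    cycle_cw: vertices of the cycle listed in clockwise order.
    root:     the chosen root v; the edge from v to its counterclockwise
              neighbor is the one left out of the tree.
    """
    k = cycle_cw.index(root)
    # Rotate so the walk starts at the root and proceeds clockwise.
    order = cycle_cw[k:] + cycle_cw[:k]
    parent = {order[0]: None}
    for prev, cur in zip(order, order[1:]):
        parent[cur] = prev
    return parent
```

The last vertex in the walk is the counterclockwise neighbor w of the root; the omitted edge {v, w} joins w to its ancestor v and is therefore a back-edge.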

For the sake of simplicity, all the previous arguments were based on the 
assumption that the boundary of G_{i-1} is a simple cycle. In the general case we 
compute the boundary cycles C_1, ..., C_k of G_{i-1} and apply the above algorithm 
to every C_j. Each cycle C_j is the boundary of a uniform component G'_j of Ḡ_{i-1}. 
Thus, the cycles C_1, ..., C_k separate the subgraphs H_{ij} = H_i ∩ G'_j from each 
other. Details will appear in the full paper. This concludes the proof of Thm. 3. 



4 Conclusions 

We developed the first o(N) I/O and linear space algorithm for DFS in planar graphs. 
We also designed an O(sort(N)) reduction from planar DFS to planar BFS, 
proving that in external memory DFS is not harder than BFS and thus providing 
further evidence that BFS is among the hardest problems for planar graphs. 

Adding the single source shortest path algorithm of ^ as an intermediate 
reduction step, we can modify our reduction algorithm in order to reduce planar 
DFS to BFS on either a planar triangulated graph or a planar 3-regular graph. 
Developing an efficient BFS algorithm for one of these classes of graphs remains 
an open problem. 